(57) Abstract: A targeting polypeptide is provided that may be used to target a chosen antigen to an antigen presenting cell.
Complexes comprising such targeting polypeptide and antigen, nucleic acids and vectors encoding them, and cells comprising the nucleic
acids and vectors may be used in methods of immunisation and enhance the immunogenicity of the antigen.

TARGETING POLYPEPTIDE

Field of the Invention 

The present invention relates to targeting polypeptides and their use in 
targeting antigens to antigen presenting cells (APCs). 



Background of the Invention 

Staphylococcus aureus is a major causative agent of community and hospital
acquired infections worldwide. The organism is an important pathogen due to a
combination of invasiveness, toxin production, and antibiotic resistance. S. aureus
causes a wide variety of clinical syndromes, ranging from uncomplicated infections of
the skin to life-threatening toxic shock syndrome (TSS). The bacterium causes disease
by producing large numbers of exoproteins or virulence factors. Among the many
known virulence factors are two families of staphylococcal pyrogenic superantigens,
staphylococcal enterotoxins (SEs) and toxic shock syndrome toxin-1 (TSST-1).

Superantigens bind to major histocompatibility complex (MHC) class-II on
antigen presenting cells (B cells, monocytes, and dendritic cells) outside the classical
antigen-binding groove, and activate T cells by binding with the variable region of the
β chain of T cell receptors (Vβ-TCR). This cross-linking triggers the non-specific
activation and proliferation of T cells, induces the production of high levels of a
variety of cytokines, and causes toxic shock syndrome characterized by fever, rash,
hypotension and multiple organ failure. Staphylococcal enterotoxins are responsible
for many cases of food poisoning (intoxication) associated with ingestion of
toxin-contaminated food. To date, more than thirteen staphylococcal enterotoxins have been
described.

As well as these classical superantigens, S. aureus also produces a family of
proteins that have sequence homology to the superantigens. These proteins are known
as the staphylococcal exotoxin-like proteins (SETs) and are a family of polymorphic
paralogs. They were first identified as a genetic locus encoding at least five
exotoxin-like proteins (SET1-5). More recently, data from the sequencing of the genomes of
several different S. aureus strains have revealed a large number of related (36-67%) set
genes clustered on a genomic island. This putative pathogenicity island, which is
present in all strains of S. aureus examined to date, codes for between seven and
fourteen set genes, which have varying degrees of sequence homology. In addition,
there appears to be extensive inter-strain allelic polymorphism for each of the set
genes. The International Nomenclature Committee for Staphylococcal Superantigen
Nomenclature (INCSS) has recently recommended that the SETs be renamed
staphylococcal superantigen-like exoproteins (SSLs) and numbered from SSL1 to
SSL14 in clockwise order from the replication origin of the chromosome based on
homology to the full complement of genes found in strain MW2. This nomenclature is
essentially as described by Fitzgerald et al. (Infect. Immun., 2003, 71, 2827-2838)
except that the numbering of the genes is in the opposite direction. To differentiate
between allelic variants, the ssl gene is prefixed by the strain name.

The three-dimensional structure of one member of the family, SSL5
(previously SET3), has been determined. The crystal structure of this protein shows
many of the characteristic structures of the superantigen superfamily, but significant
differences also exist. In addition, SSLs do not show the main properties of classical
superantigens, such as polyclonal T cell activation, pyrogenicity, or enhancement of
endotoxin shock. The function of SSLs is therefore unknown.



Summary of the Invention 

The present invention is based on the finding that staphylococcal superantigen-like
exoproteins (SSLs) are able to target themselves to antigen presenting cells
(APCs). This targeting to antigen presenting cells, and hence to the antigen presentation
pathway of these cells, means that a chosen antigen can also be targeted to the antigen
presenting cell, facilitating presentation of the antigen and hence increasing
immunogenicity. The invention may also be used to target antigens to antigen
presenting cells in order to induce tolerance.

Accordingly, the invention provides for the use of a complex comprising:
(a) a targeting polypeptide comprising a staphylococcal superantigen-like 
protein (SSL), a fragment thereof or a variant of either, where the SSL, fragment or 
variant has the ability to target the complex to an antigen presenting cell; and 
(b) an antigen and/or a nucleic acid molecule encoding an antigen, 

in the manufacture of a medicament for use in immunization or the induction 
of tolerance. 

The invention also provides a complex comprising: 
(i) a targeting polypeptide as defined in any one of the preceding claims; and 
(ii) an antigen or a nucleic acid encoding an antigen, wherein the antigen or
encoded antigen is selected from a pathogenic antigen, auto-antigen, an allergen and a 
cancer antigen. 

The invention also provides a virus comprising a targeting polypeptide of the
invention.

In addition, the invention provides:

a nucleic acid molecule comprising a polynucleotide sequence encoding a
targeting polypeptide and an antigen selected from a pathogenic antigen, auto-antigen,
an allergen and a cancer antigen;

a vector comprising a nucleic acid of the invention; and

a cell comprising a nucleic acid or a vector of the invention or infected with a
virus of the invention.

The invention also provides a method of loading antigen presenting cells
comprising contacting an antigen presenting cell with a complex or virus of the
invention. An antigen presenting cell which has been loaded with a complex or virus
is also provided.

The invention additionally provides:

a pharmaceutical composition comprising a complex of the invention, a
nucleic acid encoding a targeting polypeptide and antigen of a complex of the
invention, a vector comprising such a nucleic acid, a cell comprising such a nucleic
acid or vector, a virus of the invention or an antigen presenting cell of the invention
and a pharmaceutically acceptable carrier or diluent;

a vaccine comprising a complex of the invention, a nucleic acid encoding the
targeting polypeptide and antigen of a complex of the invention, a vector comprising
such a nucleic acid, a cell comprising such a nucleic acid or vector, a virus of the
invention or an antigen presenting cell of the invention; and

a complex of the invention, a nucleic acid encoding a targeting polypeptide
and antigen of a complex of the invention, a vector comprising such a nucleic acid, a
cell comprising such a nucleic acid or vector, a virus of the invention, or an antigen
presenting cell of the invention for use in a method of treatment of the human or
animal body by therapy.

The invention also provides for the use of a nucleic acid encoding a targeting
polypeptide and antigen of a complex of the invention, a vector comprising such a
nucleic acid, a cell comprising such a nucleic acid or vector, a virus of the invention
or an antigen presenting cell of the invention in the manufacture of a medicament for
use in immunisation.

The invention further provides a method of immunising a subject, the method
comprising administering an effective amount of a complex of the invention, a nucleic
acid encoding a targeting polypeptide and antigen of a complex of the invention, a
vector comprising such a nucleic acid, a cell comprising such a nucleic acid or
vector, a virus of the invention or an antigen presenting cell of the invention to a
subject.

The invention also provides an agent for immunising a subject, the agent
comprising a complex of the invention, a nucleic acid encoding a targeting
polypeptide and antigen of a complex of the invention, a vector comprising such a
nucleic acid, a cell comprising such a nucleic acid or vector, a virus of the invention
or an antigen presenting cell of the invention.

Brief Description of the Figures

Figure 1 - Panel (a) shows the structure of SSL7, shaded from white
(N-terminal) to dark (C-terminal). Panel (b) shows the structure of SSL7 (dark) optimally
superposed on that of SSL5 (grey) and the same structure is shown on the right
rotated by 90°. Panel (c) shows the structure of SSL7 (dark) optimally superposed on
that of SPEC (grey) and the same structure is shown on the right rotated by 90°. Panel
(d) shows the SSL7 dimer.

Figure 2 - shows residues in the SSL7 dimer interface with residue numbers
being given on the x-axis, and buried surface area (Å²) on the y-axis. The two forms
are shown by broken and unbroken lines.

Figure 3 - Panel (a) shows the results of FACS analysis of PBMCs (peripheral
blood mononuclear cells) incubated with varying concentrations of SSL7-FITC at
4°C and 37°C. The graph is representative of ten separate experiments carried out with
PBMCs from healthy donors. Panel (b) shows the results of competition experiments
between SSL9-FITC (top) and SSL7-FITC (bottom) with various molecules for
binding to PBMCs. Results shown are from one representative experiment of three.

Figure 4 - shows the results of FACS analysis of PBMC stained with SSL7
(top) or SSL9 (bottom) and various markers. The results shown are representative of
five separate experiments.

Figure 5 - shows the results of FACS analysis of PBMC-derived dendritic
cells stained with SSL9, SSL7 or no stain. The results are representative of a set of
three separate experiments on cells from healthy donors.

Figure 6 - shows that SSL7 and SSL9 interact selectively with dendritic cells.
FACS results for unpurified dendritic cells incubated with SSL7-FITC or SSL9-FITC
and then stained for CD1a are shown. The numbers show the percentage of SSL
positive cells that were also CD1a positive.

Figure 7 - shows that SSLs do not alter dendritic cell morphology or cell
surface phenotype. Panel (A) shows the morphology of dendritic cells treated with
SSL proteins. Results for lipopolysaccharide (LPS) and peptidoglycan (PG) are
shown as positive controls. Results are representative of three experiments. Panel
(B) shows FACS results for phenotypic analysis of dendritic cells treated with SSL
proteins. Data are shown for expression of cell surface molecules on dendritic cells
that have been treated with SSL7 or SSL9. LPS and PG were used as positive
controls. Expression of the indicated markers is shown by the solid histograms,
whereas cells stained with relevant control mAb are indicated by the open line
histograms. The numbers on each histogram correspond to the median fluorescence
intensity (MFI) of mAb staining. Results shown are from one donor and are
representative of similar data obtained from experiments carried out with dendritic
cells from four different donors.

Figure 8 - shows endocytosis of FITC-Dextran by dendritic cells exposed to
SSL proteins. The results of one of three separate experiments are shown. 

Figure 9 - shows the effect of SSL protein on the T cell stimulatory
capacity of dendritic cells. Panel (A) shows the effect of SSL7 (black bars) or SSL9
(empty bars) or medium alone (grey bars) on stimulation of autologous T cells in the
presence or absence (M) of purified protein derivative (PPD). Data shown are mean
± SEM of 3 experiments.

Figure 10 - shows autologous T cell responses to dendritic cells loaded with
SSL7 or SSL9 protein. Data are mean ± SD of triplicate cultures from individual
experiments.

Figure 11 - shows the effects of SSL protein on cytokine production. Purified
protein derivative is used as control. 

Figure 12 - shows antibody responses in human sera. Panel (A) shows levels
of SSL7 or SSL9 measured by ELISA. Results for a polyclonal rabbit antibody raised
against purified His-tagged SSL7 are included as a positive control. Data are
representative of 3 experiments. Panels (B) and (C) show the results of a competitive
ELISA with plates coated with either SSL7 (B) or SSL9 (C) where serum diluted
1:2000 (final dilution) was mixed with varying concentrations of SSL7, SSL9 or
Emblp32 as indicated.

Brief Description of the Sequences 

SEQ ID No: 1 provides the nucleotide sequence of a genomic fragment
comprising the pathogenicity island SaPIn2 from S. aureus strain N315 which
includes the ssl1 to ssl5 and ssl7 to ssl11 genes.

SEQ ID Nos: 2 to 11 provide the amino acid sequences of SSL1 to SSL5 and
SSL7 to SSL11 from S. aureus strain N315 and also indicate the location of the CDS
of the genes in SEQ ID No: 1.

SEQ ID No: 12 provides the nucleotide sequence of a genomic fragment from
S. aureus strain N315, which includes the ssl12 to ssl14 genes. The sequence
indicated is the complement of the coding strand. The order of the genes in the
indicated sequence going from 5' to 3' is ssl14, ssl13 and ssl12. The complementary
sequence is given by SEQ ID No: 103.

SEQ ID Nos: 13 to 15 provide the amino acid sequences of SSL14, SSL13
and SSL12 respectively from S. aureus strain N315 and also indicate the location of
the CDS of the genes in SEQ ID No: 12. The start of each coding sequence indicated
is the higher nucleotide position listed.
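
As an illustration of the coordinate convention just described, the following minimal Python sketch shows how a coding sequence could be recovered from a listing that gives the complement of the coding strand, where the start of the CDS is the higher of the two listed positions. The function names and the toy ten-nucleotide listing are hypothetical and are not taken from the sequence listing; positions are assumed to be 1-based and inclusive.

```python
# Minimal sketch (assumed conventions): positions are 1-based and inclusive,
# and the listed sequence is the complement of the coding strand, so the
# coding sequence is the reverse complement of the listed sub-sequence.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def coding_sequence(listed: str, low: int, high: int) -> str:
    """Extract a CDS whose start is the higher listed position.

    `listed` is the sequence as given in the listing (complement strand);
    `low` and `high` are the two nucleotide positions of the CDS.
    """
    if not (1 <= low <= high <= len(listed)):
        raise ValueError("positions must satisfy 1 <= low <= high <= len(listed)")
    return reverse_complement(listed[low - 1:high])


# Toy listing (hypothetical): positions 3 to 8 hold the complement of a
# two-codon reading frame consisting of a start and a stop codon.
toy_listing = "GGTTACATCC"
print(coding_sequence(toy_listing, 3, 8))
```

Read in this way, the CDS begins at position 8 of the toy listing and runs towards position 3.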

SEQ ID No: 16 provides the nucleotide sequence of a genomic fragment
comprising the pathogenicity island SaPIn2 from S. aureus strain Mu50, which
includes the ssl1 to ssl3, ssl5 and ssl7 to ssl11 genes.

SEQ ID Nos: 17 to 25 provide the amino acid sequences of SSL1 to SSL3,
SSL5 and SSL7 to SSL11 respectively from S. aureus strain Mu50 and also indicate the
location of the CDS of the genes in SEQ ID No: 16.

SEQ ID Nos: 26 to 36 provide the amino acid sequences of SSL1 to SSL11
respectively from S. aureus strain MW2 and also indicate the location of the CDS in
SEQ ID No: 107.

SEQ ID No: 37 provides the nucleotide sequence of a genomic fragment from
S. aureus strain NCTC8325, and includes the sequences of the ssl1 to ssl11 genes.
The sequence indicated is the complement of the coding strand. The order of the
genes in the indicated sequence going from 5' to 3' is ssl11 to ssl1. The
complementary DNA sequence is given by SEQ ID No: 104.

SEQ ID Nos: 38 to 48 provide the amino acid sequences of SSL1 to SSL11
respectively from S. aureus strain NCTC8325 and also indicate the location of the
CDS of the genes in SEQ ID No: 37. The start of each coding sequence indicated is
the higher nucleotide position listed.

SEQ ID No: 49 provides the nucleotide sequence of a genomic fragment from
S. aureus strain NCTC8325, and includes the sequences of the ssl12 to ssl14 genes.

SEQ ID Nos: 50 to 52 provide the amino acid sequences of SSL12 to SSL14
respectively from S. aureus strain NCTC8325 and also indicate the location of the
CDS of the genes in SEQ ID No: 49.

SEQ ID No: 53 provides the nucleotide sequence of a genomic fragment from
S. aureus strain EMRSA 16(252), which includes the sequences of the ssl1 to ssl5,
ssl7 and ssl9 to ssl11 genes. The sequence indicated is the complement of the coding
strand. The order of the genes in the indicated sequence going from 5' to 3' is ssl11 to
ssl9, ssl7 and ssl5 to ssl1. The complementary DNA sequence is given by SEQ ID
No: 105.

SEQ ID Nos: 54 to 62 provide the amino acid sequences of SSL1 to SSL5,
SSL7 and SSL9 to SSL11 respectively from S. aureus strain EMRSA 16(252) and also
indicate the location of the CDS of the genes in SEQ ID No: 53. The start of each
coding sequence indicated is the higher nucleotide position listed.

SEQ ID No: 63 provides the nucleotide sequence of a genomic fragment from
S. aureus strain EMRSA 16(252) and includes the sequences of the ssl12 to ssl14 genes.
The sequence indicated is the complement of the coding strand. The order of the
genes in the indicated sequence going from 5' to 3' is ssl14 to ssl12. The
complementary sequence is given by SEQ ID No: 106.

SEQ ID Nos: 64 to 66 provide the amino acid sequences of SSL12 to SSL14
respectively from S. aureus strain EMRSA 16(252) and also indicate the location of
the CDS of the genes in SEQ ID No: 63. The start of each coding sequence indicated
is the higher nucleotide position listed.

SEQ ID No: 67 provides the nucleotide sequence of a genomic fragment from
S. aureus strain MSSA-476 which includes the ssl1 to ssl11 genes.

SEQ ID Nos: 68 to 78 provide the amino acid sequences of SSL1 to SSL11
respectively from S. aureus strain MSSA-476 and also indicate the location of the
CDS of the genes in SEQ ID No: 67.

SEQ ID No: 79 provides the nucleotide sequence of a genomic fragment from
S. aureus strain MSSA-476 which includes the sequences of the ssl12 to ssl14 genes.

SEQ ID Nos: 80 to 82 provide the amino acid sequences of SSL12 to SSL14
from S. aureus strain MSSA-476 and also indicate the location of the CDS of the
genes in SEQ ID No: 79.

SEQ ID Nos: 83 and 84 provide the nucleotide and amino acid sequences
respectively of ssl1 from S. aureus strain COL.

SEQ ID Nos: 85 and 86 provide the nucleotide and amino acid sequences
respectively of ssl2 from S. aureus strain COL.

SEQ ID Nos: 87 and 88 provide the nucleotide and amino acid sequences
respectively of ssl3 from S. aureus strain COL.

SEQ ID Nos: 89 and 90 provide the nucleotide and amino acid sequences
respectively of ssl4 from S. aureus strain COL.

SEQ ID Nos: 91 and 92 provide the nucleotide and amino acid sequences 
respectively of ssl9 from S. aureus strain COL. 

SEQ ID Nos: 93 and 94 provide the nucleotide and amino acid sequences
respectively of ssl10 from S. aureus strain COL.

SEQ ID Nos: 95 and 96 provide the nucleotide and amino acid sequences
respectively of ssl11 from S. aureus strain COL.

SEQ ID Nos: 97 and 98 provide the nucleotide and amino acid sequences
respectively of ssl12 from S. aureus strain COL.

SEQ ID Nos: 99 and 100 provide the nucleotide and amino acid sequences
respectively of ssl13 from S. aureus strain COL.

SEQ ID Nos: 101 and 102 provide the nucleotide and amino acid sequences
respectively of ssl14 from S. aureus strain COL.

SEQ ID No: 103 provides the complementary sequence to SEQ ID No: 12. 

SEQ ID No: 104 provides the complementary sequence to SEQ ID No: 37. 

SEQ ID No: 105 provides the complementary sequence to SEQ ID No: 53. 

SEQ ID No: 106 provides the complementary sequence to SEQ ID No: 63. 

SEQ ID No: 107 provides the nucleotide sequence of a genomic fragment
from S. aureus strain MW2 which includes the ssl1 to ssl11 genes. The amino acid
sequences of the encoded proteins are provided by SEQ ID Nos: 26 to 36 which
indicate the position of the coding sequences in SEQ ID No: 107.

Detailed Description of the Invention

Throughout the present specification and the accompanying claims the words
"comprise" and "include" and variations such as "comprises", "comprising",
"includes" and "including" are to be interpreted inclusively. That is, these words are
intended to convey the possible inclusion of other elements or integers not specifically
recited, where the context allows. In some cases, where specific constituents are
recited, the embodiment may, for example, consist essentially of such constituents.

The present invention is based on the finding that SSLs are able to target
themselves to antigen presenting cells (APCs). The targeting to antigen presenting
cells leads to the SSLs entering the antigen presenting pathway of the antigen
presenting cells and being presented on the surface of the antigen presenting cells.
This enhances the immunogenicity of the SSLs. The invention uses targeting
polypeptides employing, or based on, the sequence of SSLs to target a chosen antigen
to antigen presenting cells to enhance the immunogenicity of the chosen antigen. In
effect, the targeting polypeptides are used to deliver a chosen antigen to an antigen
presenting cell.

In some instances the targeting polypeptides are used to deliver a nucleic acid
molecule encoding the antigen, rather than the antigen itself. The nucleic acid
molecule will then give rise to expression of the antigen in the antigen presenting cell
and the subsequent presentation of the antigen.

The invention employs targeting polypeptides comprising the polypeptide
sequence of an SSL, a fragment of an SSL, or a variant sequence based on either, to
target a chosen antigen or encoding nucleic acid to an APC. The invention uses
complexes comprising the targeting polypeptide and antigen or an encoding nucleic
acid to achieve delivery of the chosen antigen to an antigen presenting cell.

Targeting polypeptides 

The targeting polypeptides employed comprise a staphylococcal
superantigen-like protein (SSL), a fragment thereof or a variant of either, where the SSL, fragment
or variant has the ability to target the targeting polypeptide to an antigen presenting
cell (APC). The ability to target a polypeptide to an antigen presenting cell is referred
to herein as targeting activity.

The targeting polypeptide may comprise the sequence of any naturally
occurring SSL polypeptide. Such polypeptides may typically be isolated from
Staphylococcus aureus.

The amino acid sequences of SSLs 1 to 14 from a variety of strains of
Staphylococcus aureus are provided herein and any of these sequences may be present
in a targeting polypeptide of the invention. The Table below indicates the
corresponding SEQ ID Nos for the SSLs.

Table 1 - SSLs

Staphylococcus aureus strain and SEQ ID No for SSL
(- = no sequence listed for that strain)

SSL    N315    Mu50    MW2    NCTC 8325    EMRSA 16(252)    MSSA 476    COL
1      2       17      26     38           54               68          84
2      3       18      27     39           55               69          86
3      4       19      28     40           56               70          88
4      5       -       29     41           57               71          90
5      6       20      30     42           58               72          -
6      -       -       31     43           -                73          -
7      7       21      32     44           59               74          -
8      8       22      33     45           -                75          -
9      9       23      34     46           60               76          92
10     10      24      35     47           61               77          94
11     11      25      36     48           62               78          96
12     15      -       -      50           64               80          98
13     14      -       -      51           65               81          100
14     13      -       -      52           66               82          102
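
For readers who wish to work with the table programmatically, the following Python sketch transcribes it into a lookup structure. The dictionary layout and the function names are illustrative assumptions and form no part of the specification; blank cells in the table are taken to mean that no sequence is listed for that strain and are represented as `None`.

```python
from typing import Dict, List, Optional

# Column order of the table above (strain names as used in the text).
STRAINS: List[str] = [
    "N315", "Mu50", "MW2", "NCTC8325", "EMRSA 16(252)", "MSSA-476", "COL",
]

# SEQ ID numbers transcribed row by row; None = no sequence listed.
SEQ_ID_TABLE: Dict[int, List[Optional[int]]] = {
    1: [2, 17, 26, 38, 54, 68, 84],
    2: [3, 18, 27, 39, 55, 69, 86],
    3: [4, 19, 28, 40, 56, 70, 88],
    4: [5, None, 29, 41, 57, 71, 90],
    5: [6, 20, 30, 42, 58, 72, None],
    6: [None, None, 31, 43, None, 73, None],
    7: [7, 21, 32, 44, 59, 74, None],
    8: [8, 22, 33, 45, None, 75, None],
    9: [9, 23, 34, 46, 60, 76, 92],
    10: [10, 24, 35, 47, 61, 77, 94],
    11: [11, 25, 36, 48, 62, 78, 96],
    12: [15, None, None, 50, 64, 80, 98],
    13: [14, None, None, 51, 65, 81, 100],
    14: [13, None, None, 52, 66, 82, 102],
}


def seq_ids_for(ssl: int) -> List[int]:
    """All SEQ ID Nos listed for a given SSL number, in column order."""
    return [s for s in SEQ_ID_TABLE[ssl] if s is not None]


def seq_id(ssl: int, strain: str) -> Optional[int]:
    """SEQ ID No for one SSL in one strain, or None if none is listed."""
    return SEQ_ID_TABLE[ssl][STRAINS.index(strain)]


print(seq_ids_for(9))
print(seq_id(12, "N315"))
```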

The SSL sequence employed may be one or more of SSL1 to SSL14. In a
preferred case where, for example, it is desired to use the sequence of an SSL9, the
SSL9 will have the sequence of one of the SSL9 indicated in Table 1, i.e. will be
selected from SEQ ID Nos: 9, 23, 34, 46, 60, 76 and 92. In a preferred instance where
it is desired to employ the sequence of an SSL5, the SSL5 may be selected from the
sequences of SEQ ID Nos: 6, 20, 30, 42, 58, and 72. In another preferred case, where
it is desired to employ the sequence of an SSL7, the SSL7 may be selected from the
sequences of SEQ ID Nos: 7, 21, 32, 44, 59 and 74. Thus Table 1 indicates examples
of preferred sequences for a particular SSL.

In a particularly preferred embodiment the SSL sequence employed may be
SSL5, SSL7 and/or SSL9 and may preferably be one of the specific SSL7 and SSL9
molecules whose sequence is provided herein. The SSL sequence may be a fragment
of such a sequence. The SSL sequence employed may be a variant of the specific
SSL7 and SSL9 molecules whose sequence is provided herein or may be a variant of a
fragment of such a sequence.

In some cases, the SSL employed may be an allelic form of an SSL, as SSL
genes from different Staphylococcus aureus strains vary in sequence. In some
instances the SSL may be from one of the Staphylococcus aureus strains MW2,
NCTC6571, FRI326, NCTC8325, N315, Mu50, EMRSA 16(252), MSSA-476 or
COL. In a particularly preferred embodiment the SSL may be from NCTC6571.

The targeting polypeptide may comprise a fragment of a naturally occurring
SSL where the fragment retains the ability to target the targeting polypeptide to an
antigen presenting cell. Such fragments may comprise subregions of any one or more
of the sequences of any of SSL1 to SSL14 and in particular of the sequences provided
herein of particular SSL1 to SSL14 molecules.

The targeting polypeptide may, for example, contain a sub-region of an SSL
that is from 25 to 200, preferably from 35 to 175, even more preferably from 50 to
150 and even more preferably 75 to 125 amino acids in length. The targeting
polypeptide may comprise a sub-region of an SSL that is 220 or less, preferably 180
or less, more preferably 160 or less, still more preferably 140 or less and even more
preferably 120 or less amino acids in length. Such fragments may be derived from any
SSL and in particular from the amino acid sequences indicated herein for SSL1 to
SSL14. In some cases, the subregion may comprise at least 30, preferably at least 50,
more preferably at least 75 and even more preferably at least 100 amino acids from
the SSL. In a preferred case the fragment may be from SSL5, SSL7 and/or SSL9. In a
particularly preferred case the fragment may be from SSL9 and/or SSL7.
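
The nested length ranges recited above can be expressed compactly. The following Python sketch, offered purely as an illustration, reports the narrowest of the recited ranges into which a given sub-region length falls; the function name is hypothetical, and the ranges are taken directly from the paragraph above.

```python
from typing import Optional, Tuple

# Recited sub-region length ranges (amino acids), narrowest first.
LENGTH_RANGES = [(75, 125), (50, 150), (35, 175), (25, 200)]


def narrowest_range(length: int) -> Optional[Tuple[int, int]]:
    """Return the narrowest recited range containing `length`, or None."""
    for low, high in LENGTH_RANGES:
        if low <= length <= high:
            return (low, high)
    return None


for n in (100, 160, 30, 10):
    print(n, narrowest_range(n))
```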

The targeting polypeptide may comprise a variant sequence of an SSL or a
fragment thereof. Such variant sequences will be able to target themselves to antigen
presenting cells. Any suitable variant polypeptide capable of directing the targeting
polypeptide to an antigen presenting cell may be employed. In some cases, the
variant will have at least 25%, preferably at least 30%, more preferably at least 35%,
still more preferably at least 40% and even more preferably at least 45% amino acid
sequence identity to an SSL and in particular to an amino acid sequence of any one or
more of the specific sequences provided herein for SSL1 to SSL14. Thus the variant may
have the specified level of sequence identity with the equivalent SSL whose sequence
is provided herein.

The level of amino acid sequence identity may be at least 50%, more
preferably at least 55%, even more preferably at least 60% and still more preferably at
least 65% amino acid sequence identity. The variant may, for example, have at least
75%, preferably at least 80%, more preferably at least 85%, still more preferably at
least 90% sequence identity and even more preferably at least 95% sequence identity
to an SSL and in particular to one or more of the specific SSL1 to SSL14 molecules
whose sequence is provided herein.
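
By way of illustration only, the following Python sketch shows one simple way of computing percentage amino acid identity between two sequences that have already been aligned, and of finding the highest of the recited thresholds that a value meets. The passage above does not say how identity is calculated, so the convention used here (identical residues divided by aligned columns, ignoring columns that are gaps in both sequences) is an assumption, as are the function names and the toy sequences.

```python
from typing import Optional

# Identity thresholds (%) recited in the two paragraphs above.
THRESHOLDS = (25, 30, 35, 40, 45, 50, 55, 60, 65, 75, 80, 85, 90, 95)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumed convention: matches / aligned columns, where columns that are
    gaps in both sequences are ignored and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float) -> Optional[int]:
    """Highest recited threshold that `identity` reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Toy alignment (hypothetical sequences, not from the sequence listing).
identity = percent_identity("MKLT-AV", "MKITQAV")
print(round(identity, 1), highest_threshold_met(identity))
```

In practice the alignment itself would be produced by a dedicated alignment tool, and a different denominator convention would give somewhat different percentages.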

In some cases the SSL employed may be an allelic variant of a known SSL
gene including an allele of any of SSLs 1 to 14. Such variants may have a high degree
of sequence identity to a known SSL allele and in particular to one of those provided
herein. Thus in some cases the variant may have at least 85%, preferably at least
95%, more preferably at least 97%, and even more preferably at least 99% sequence
identity to any of the SSLs mentioned herein. The amino acid sequences of alleles of
SSL5, SSL7 and SSL9 are indicated in Table 1 and the targeting polypeptide may
comprise one or more of these sequences.

In instances where the targeting polypeptide comprises a variant sequence, the
variant sequence may have one of the levels of sequence identity specified herein to
more than one SSL. Thus, for example, the invention encompasses a variant of an
SSL or a variant of a fragment of an SSL that has one of the specified levels of
sequence identity to all of the SSLs whose sequence is provided herein or to a
particular SSL. The sequence may have one of the specified levels of sequence identity
to all of the alleles provided herein for a particular SSL whose sequence is provided
herein.

In some cases a variant sequence may have one of the specified levels of 
sequence identity to at least two, preferably at least three, more preferably at least five 
and even more preferably at least six of the sequences provided herein for a particular 
SSL. In a preferred embodiment it will have one of the levels of sequence identity 
specified to all of the sequences provided herein for SSL5, SSL7 and/or SSL9. The 
variant may in particular have one of the specified levels of sequence identity to all of 
the sequences provided herein for SSL7 and/or SSL9. In a preferred case, the variant 
may have the specified level of sequence identity to the alleles from strain
NCTC6571.

The targeting polypeptide may comprise a variant sequence of a fragment of
an SSL. Thus the fragment may be any of the lengths referred to herein. It may have
any of the degrees of sequence identity referred to herein. In general a variant
sequence may have the specified level of sequence identity over the whole of the
variant, over at least 20, preferably at least 50, more preferably at least 75, and even
more preferably over at least 100 amino acids. In the case of fragments, the sequence
identity may be over any of the lengths specified herein and in particular over the
entire fragment or targeting sequence. The variant or fragment has to be able to target
the polypeptide to an antigen presenting cell. In cases where the targeting
polypeptide comprises sequences other than those responsible for targeting, the level
of sequence identity may be over the region corresponding to the SSL, fragment
thereof or variant of either. The specified level of sequence identity may be over the
minimum region in the polypeptide necessary for targeting to the antigen presenting
cell.
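
The notion of sequence identity "over at least" a given number of amino acids can be illustrated with a sliding window. The Python sketch below, under the simplifying assumption that the two sequences are already aligned position by position without gaps, reports the best percentage identity found in any window of a chosen length. The function name and toy sequences are hypothetical.

```python
def best_window_identity(seq_a: str, seq_b: str, window: int) -> float:
    """Best percent identity over any contiguous window of `window` residues.

    Assumes the sequences are aligned position by position with no gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not 1 <= window <= len(seq_a):
        raise ValueError("window must be between 1 and the sequence length")
    # 1 where the residues are identical, 0 otherwise.
    matches = [1 if a == b else 0 for a, b in zip(seq_a, seq_b)]
    current = sum(matches[:window])
    best = current
    # Slide the window one residue at a time, updating the running count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


# Toy sequences (hypothetical): identical over the first five residues only.
print(best_window_identity("AAAAABBBBB", "AAAAACCCCC", 5))
print(best_window_identity("AAAAABBBBB", "AAAAACCCCC", 10))
```

A variant would then meet a recited identity level over, say, 50 amino acids if the value returned for a window of 50 reaches that level.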

The variant may have amino acid substitutions in comparison to the normal
sequence of the SSL. For example, it may have 1, 2, 3 or more substitutions, such
as from 5 to 10, 10 to 20, 20 to 30 or more amino acid substitutions. The variant may
have, in addition or alternatively, such numbers of amino acids deleted or inserted
into it in comparison to the normal SSL. The amino acid changes may be conservative
substitutions, for example according to the following Table. Amino acids in the same
block in the second column and preferably in the same line in the third column may
be substituted for each other.

The targeting polypeptide may comprise additional sequences to those
responsible for targeting to an antigen presenting cell. In some cases the targeting
polypeptide will consist essentially of the targeting sequences. In cases where there
are additional sequences present, they may serve a variety of roles. In a particularly
preferred embodiment, the targeting polypeptide will also comprise the antigen it is
desired to target to the immune cells. The antigen may be any of those discussed
herein.

Polypeptide sequences may be present which separate the various regions of 
the targeting polypeptide. For example, regions may be present to separate the region 
responsible for targeting from an antigen. Polypeptide sequences may be present 
which allow for purification of the polypeptide such as, for example, a histidine tag or 
an antibody recognition site. The targeting polypeptide may include an enzymatic
cleavage site and in particular a protease recognition site may be present in the
targeting polypeptide. It may be desirable to be able to remove sequences from the
polypeptide, such as those used to purify the polypeptide, and cleavage sites may be
employed for this purpose.

The targeting polypeptide may lack specific sequences, for example because
they have been cleaved after they are no longer of use. Thus in a preferred instance
the targeting polypeptide may, for example, lack a histidine tag and hence lack four,
five, six, seven, eight, ten or more consecutive histidine residues typically at the
terminus of the protein. The targeting polypeptide may lack an antibody recognition
sequence used in purification, a reporter sequence and/or sequences such as
thioredoxin or PhoA sequences. In some embodiments of the invention such
sequences will not be present at any stage and will not be encoded by the nucleic 
acids of the invention. 
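
Purely as an illustrative sketch, the absence of a histidine tag of the kind described above could be checked in Python as follows. The function name, the default minimum run of four residues and the 15-residue window taken to represent "the terminus" are all assumptions chosen for this example.

```python
import re


def has_terminal_his_tag(sequence: str, min_run: int = 4, window: int = 15) -> bool:
    """Return True if a run of at least `min_run` consecutive histidines (H)
    lies within `window` residues of either terminus of `sequence`.

    `window` is a hypothetical cut-off for what counts as terminal.
    """
    seq = sequence.upper()
    run = re.compile("H{%d,}" % min_run)
    return bool(run.search(seq[:window]) or run.search(seq[-window:]))
```

A polypeptide from which the tag has been cleaved would be expected to return False under this check.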

The targeting polypeptide may comprise viral sequences. A targeting
polypeptide of the invention may be used to target a viral particle to an antigen
presenting cell. In such cases, the targeting polypeptide will typically be provided
wholly or partially on the surface of the virus to allow the targeting sequence to target
the virus to an antigen presenting cell. The targeting polypeptide may increase the
affinity of the viral particle for the antigen presenting cell and may mean that the viral
particle is more selective in binding to antigen presenting cells as opposed to other 
cell types. 

The viral sequences may be, or comprise sequences from, surface proteins or 
polypeptide sequences which allow the targeting polypeptide to be displayed on the
surface of the virus. The targeting polypeptide may comprise trans-membrane
sequences to allow the targeting polypeptide to be present in a membrane and in 
particular in a viral membrane. 

The targeting sequence may comprise linker sequences to allow different 
domains in the targeting polypeptide to function. In one case the targeting polypeptide
may comprise a proline rich linker and in particular all or part of the proline rich
sequence from the surface protein of the 4070A leukaemia virus. In cases
where the targeting polypeptide is employed to target a viral particle to an antigen
presenting cell the targeting sequences may be present in the targeting polypeptide
together with the sequences necessary for fusion of the membranes of the virus and
antigen presenting cell to allow entry of the virus into the cell.

The targeting polypeptide may comprise a protease cleavage site which will 
allow cleavage of part of the targeting polypeptide and in particular a protease site
which may be cleaved by a protease present on the surface of the antigen presenting
cell.

In a preferred embodiment, the targeting polypeptide may comprise the
sequence of SSL 5, 7, or 9 and in a particularly preferred case the sequence of SSL 7
or 9 may be employed. In some cases the sequence of SSL 7 may be employed,
particularly where it is desired to target B cells such as CD19+ B cells. In other cases,
such as for example where it is not desired to target B cells, other SSLs may be 
employed and in particular SSL9 may be employed. In each case fragments of such 
sequence or variants of either may be employed including any of those referred to 
herein. 

In a preferred embodiment of the invention the targeting polypeptide 
comprises: 

(a) an SSL polypeptide having the amino acid sequence of any of SEQ ID Nos:
2 to 11, 13 to 15, 17 to 36, 38 to 48, 50 to 52, 54 to 66, 68 to 78, 80 to 82, 84, 86, 88,
90, 92, 94, 96, 98, 100 and 102;

(b) a fragment of any of the sequences of (a), the fragment having the ability 
to target the complex to an antigen presenting cell; and/or 

(c) a variant polypeptide having at least 30% amino acid sequence identity 
to any of the polypeptides of (a) or (b) and the ability to target the 
complex to an antigen presenting cell. 

In another embodiment the targeting polypeptide comprises:

(a) the sequence of SEQ ID No: 7, 21, 32, 44, 59 and/or 74; 

(b) a fragment of any of the sequences of (a), the fragment having the ability 
to target the complex to an antigen presenting cell; and/or 

(c) a variant polypeptide having at least 70% amino acid sequence identity
to any of the polypeptides of (a) or (b) and the ability to target the 
complex to an antigen presenting cell. 

In another embodiment the targeting polypeptide comprises:

(a) the sequence of SEQ ID No: 9, 23, 34, 46, 60, 76 and/or 92; 

(b) a fragment of any of the sequences of (a), the fragment having the ability 
to target the complex to an antigen presenting cell; and/or 

(c) a variant polypeptide having at least 70% amino acid sequence identity
to any of the polypeptides of (a) or (b) and the ability to target the 
complex to an antigen presenting cell.

In some cases a targeting polypeptide may comprise a plurality of sequences
which individually would be able to lead to targeting to an antigen presenting cell.
A minimal sequence necessary for targeting to an antigen presenting cell may be
referred to as a targeting sequence. Such targeting sequences may be any of those
discussed herein which have the ability to target a polypeptide to an antigen
presenting cell. Thus a targeting polypeptide may comprise, for example, 1, 2, 3, 5 or
more such targeting sequences. The targeting polypeptide may comprise any pair of
those targeting sequences referred to herein. In some cases the targeting polypeptide
may comprise different targeting sequences with different properties.

In some cases the targeting polypeptide will be employed to target a viral
particle to an antigen presenting cell. Thus the targeting polypeptide collectively
present on the virus will be able to increase the affinity and/or specificity of the viral
particle for an antigen presenting cell in comparison to an equivalent virus lacking the
targeting polypeptide.

Antigens 

The targeting polypeptides of the invention are used to deliver a chosen 
antigen to an antigen presenting cell. The antigen may be any suitable antigen and 
typically will be a peptide or a polypeptide antigen. The antigen may, for example, be
an antigen selected from a pathogenic antigen, an auto-antigen, an allergen and a
cancer antigen.

In some cases the invention may be used to deliver a nucleic acid molecule 
encoding an antigen to an antigen presenting cell. The nucleic acid may then be 
expressed in the antigen presenting cell to give rise to the antigen and presentation of
the antigen. Thus the invention also encompasses the targeting of a nucleic acid 
molecule encoding any of the antigens mentioned herein to an antigen presenting cell. 

In a preferred instance the antigen may be an antigen from an infectious
organism. The antigen may, for example, be derived from a virus, bacterium, parasite,
protozoan, fungus, or prion. The antigen may be a surface antigen expressed on the
surface of the pathogen or may be an intracellular antigen. The antigen may be from
an intracellular pathogen or alternatively an extracellular one.

The antigen may, for example, be from a bacterium. It may be from a gram 
positive or a gram negative bacterium. The antigen may, for example, be: an antigen
from Mycobacterium (for example from Mycobacterium leprae, Mycobacterium
tuberculosis, Mycobacterium avium, Mycobacterium intracellulare, Mycobacterium
kansasii, or Mycobacterium gordonae); Pseudomonas; Yersinia; Salmonella (for
example from Salmonella typhimurium); Helicobacter (for example from
Helicobacter pylori); Borrelia (for example from Borrelia burgdorferi); Bordetella (for
example from Bordetella pertussis or Bordetella parapertussis); Legionella (for
example from Legionella pneumophila); Staphylococcus (for example from
Staphylococcus aureus); Neisseria (for example from Neisseria gonorrhoeae or
Neisseria meningitidis); Listeria (for example from Listeria monocytogenes); or
Streptococcus (for example from Streptococcus pyogenes, Streptococcus agalactiae,
Streptococcus viridans, Streptococcus faecalis, Streptococcus bovis, or Streptococcus
pneumoniae).

In some cases the antigen may be from: a Campylobacter; Enterococcus;
Haemophilus (for example from Haemophilus influenzae); Bacillus (for example from
Bacillus anthracis); Corynebacterium (for example from Corynebacterium
diphtheriae); Erysipelothrix (for example from Erysipelothrix rhusiopathiae);
Clostridium (for example from Clostridium perfringens, or Clostridium tetani); Vibrio
(for example Vibrio cholerae); Enterobacter (for example from Enterobacter
aerogenes); Klebsiella (for example from Klebsiella pneumoniae); Pasteurella (for
example from Pasteurella multocida); Bacteroides; Fusobacterium (for example from
Fusobacterium nucleatum); Streptobacillus (for example from Streptobacillus
moniliformis); Shigella; Escherichia (for example from Escherichia coli); Rickettsia;
Treponema (for example from Treponema pallidum); Lactococcus; Lactobacillus;
Brucella; Aeromonas; Francisella; Citrobacter; Rhodococcus; Leishmania; or
Strongylus (for example from Strongylus vulgaris).

Examples of preferred bacterial antigens include: the Shigella sonnei form 1
antigen; the F1 antigen of Yersinia pestis; antigens from Neisseria meningitidis and
in particular those encoded by the GNA33, GNA2001, GNA1220 and GNA1946
genes; the O-antigen of V. cholerae Inaba strain 569B; protective antigens of
enterotoxigenic E. coli, such as fimbrial antigens including colonisation factor
antigens, in particular CFA/I, CFA/II, and CFA/IV and the nontoxic B-subunit of the
heat-labile toxin; pertactin of Bordetella pertussis, adenylate cyclase-hemolysin of B.
pertussis; fragment C of tetanus toxin of Clostridium tetani and the LT (heat labile
enterotoxin) and ST (heat stable toxin) antigens.

In some instances the antigen may be a viral antigen. The antigen may, for
example, be a viral coat protein, glycoprotein or other proteins expressed on the 
surface of a virus. The antigen may be from a Picornaviridae (for example from a
polio virus, a hepatitis virus, an enterovirus, a coxsackie virus, a rhinovirus, or an
echovirus); Caliciviridae; Togaviridae (for example from an equine encephalitis virus
or a rubella virus); Flaviviridae (for example from a dengue virus, an encephalitis virus,
or a yellow fever virus); Coronaviridae (for example from a coronavirus);
Rhabdoviridae (for example from a vesicular stomatitis virus, or a rabies virus);
Filoviridae (for example from an ebola virus); Paramyxoviridae (for example from a
parainfluenza virus, mumps virus, measles virus, or a respiratory syncytial virus);
Orthomyxoviridae (for example from an influenza virus such as influenza types A, B
and C); Bunyaviridae (for example from a Hanta virus, bunya virus, phlebovirus or a
Nairo virus); Arenaviridae (for example from a hemorrhagic fever virus); Reoviridae
(for example a rotavirus); Birnaviridae; Hepadnaviridae (for example a Hepatitis B
virus); Parvoviridae; Papovaviridae (for example from a papilloma virus, or polyoma
virus); Adenoviridae; Herpesviridae (for example from herpes simplex virus (HSV) 1
or 2, varicella zoster virus, cytomegalovirus or another herpes virus); Poxviridae (for
example from a variola virus, vaccinia virus, or a pox virus); or an Iridoviridae (for
example from African swine fever virus).

In a preferred case the antigen may be from a Retroviridae (e.g., HTLV-I;
HTLV-II; or HIV-1 (also known as HTLV-III, LAV, ARV, hTLR, etc.)). In a
particularly preferred case the antigen may be one derived from HIV and in particular
the isolates HIVIIIb, HIVSF2, HIVLAV, HIVLAI, HIVMN; HIV-1CM235, HIV-1;
or HIV-2. In a particularly preferred embodiment, the antigen may be a human
immunodeficiency virus (HIV) antigen. Examples of preferred HIV antigens include,
for example, gp120, gp160, gp41, gag antigens such as p24gag and p55gag, as well
as proteins derived from the pol, env, tat, vif, rev, nef, vpr, vpu or LTR regions of
HIV. In a particularly preferred case the antigen may be HIV gp120 or a portion of
HIV gp120. The antigen may be from an immunodeficiency virus, and may, for
example, be from SIV or a feline immunodeficiency virus.

In a preferred case the viral antigen may be one from a hepatitis virus such as 
an antigen from hepatitis A virus (HAV), hepatitis B virus (HBV), hepatitis C virus 
(HCV), the delta hepatitis virus (HDV), hepatitis E virus (HEV) or hepatitis G virus
(HGV). In a particularly preferred embodiment the antigen may be from hepatitis B
virus (HBV) and may preferably be a surface or core antigen from HBV.

In another preferred case the antigen may be from the herpesvirus family.
Particular antigens include those from herpes simplex virus (HSV) types 1 and 2, such
as HSV-1 and HSV-2 glycoproteins gB, gD and gH; antigens from varicella zoster 
virus (VZV), Epstein-Barr virus (EBV) and cytomegalovirus (CMV) including CMV 
gB and gH; and antigens from other human herpesviruses such as HHV6 and HHV7. 

The antigen may be a fungal antigen, such as one from Candida or Aspergillus.
In particular, it may be from Candida albicans or Aspergillus fumigatus. The antigen
may be from Sporothrix (e.g. from Sporothrix schenckii), Histoplasma (e.g. from
Histoplasma capsulatum), Cryptococcus (e.g. from Cryptococcus neoformans) or
Pneumocystis (e.g. from Pneumocystis carinii). The antigen may be from a parasitic
pathogen and may, in particular, be from Taenia, flukes, roundworms,
Cryptosporidium, Schistosoma or Pneumocystis carinii, or from the causative agents
of amebiasis, giardiasis, trichomoniasis or trichinosis.

In some cases the antigen may be an antigen from a prion. In particular, the 
antigen may be one from the causative agent of kuru, Creutzfeldt-Jakob disease
(CJD), scrapie, transmissible mink encephalopathy and chronic wasting diseases, or 
from a prion associated with a spongiform encephalopathy, particularly BSE. The 
antigen may be from the prion responsible for familial fatal insomnia. 

In some cases the antigen may be from a parasitic pathogen including, for
example, one from the genera Plasmodium, Chlamydia, Trypanosoma, Giardia,
Boophilus, Babesia, Entamoeba, Eimeria, Leishmania, Schistosoma, Brugia, Fasciola,
Dirofilaria, Wuchereria and Onchocerca. Examples of preferred antigens from
parasitic pathogens to be expressed as the heterologous antigen include the
circumsporozoite antigens of Plasmodium species, such as the circumsporozoite
antigen of P. berghei or the circumsporozoite antigen of P. falciparum; the merozoite
surface antigen of Plasmodium species; the galactose specific lectin of Entamoeba
histolytica; gp63 of Leishmania species; paramyosin of Brugia malayi; the
triose-phosphate isomerase of Schistosoma mansoni; the secreted globin-like protein of
Trichostrongylus colubriformis; the glutathione-S-transferases of Fasciola hepatica,
Schistosoma bovis and S. japonicum; and KLH of Schistosoma bovis and S.
japonicum.

In some cases the antigen may be a cancer antigen and in particular a tumour
antigen. Examples of particular cancers from which the antigen may be derived include
cancers of the lung, prostate, breast, colon and ovary, melanoma, lymphoma and
leukaemia. Examples of particular tumour antigens include MART-1, Melan-A,
tyrosinase, p97, beta-HCG, GalNAc, MAGE-1, MAGE-2, MAGE-4, MAGE-12,
MUC1, MUC2, MUC3, MUC4, MUC18, CEA, DDC, P1A, EpCam, melanoma
antigen gp75, Hker 8, high molecular weight melanoma antigen, K19, Tyr1, Tyr2,
members of the pMel 17 gene family, c-Met, PSA (prostate antigen), PSM (prostate
mucin antigen), PSMA (prostate specific membrane antigen), prostate secretory
protein, alpha-fetoprotein, CA125, CA19.9, TAG-72, BRCA-1 and BRCA-2 antigens.

The antigen may be an auto-antigen. In particular, the antigen may be an antigen
associated with an autoimmune disease. Auto-antigens include those associated with
autoimmune diseases such as multiple sclerosis, insulin-dependent type 1 diabetes
mellitus, systemic lupus erythematosus (SLE) and rheumatoid arthritis. The antigen
may be one associated with Sjögren's syndrome, myositis, scleroderma or Raynaud's
syndrome. Further examples of auto-immune disorders that the antigen may be
associated with include ulcerative colitis, Crohn's disease, inflammatory bowel
disorder, autoimmune liver disease, or autoimmune thyroiditis. Examples of specific
autoantigens include insulin, glutamate decarboxylase 65 (GAD65), heat shock
protein 60 (HSP60), myelin basic protein (MBP), myelin oligodendrocyte glycoprotein
(MOG), proteolipid protein (PLP), and collagen type II. In cases where the antigen is
an autoantigen the antigen will typically be administered in order to promote tolerance
to the auto-antigen. In some cases, however, the invention may be used to produce an
immune response to the auto-antigen in order to generate models of the diseases.

In some cases the antigen may be an allergen. The allergenic antigen may be
any suitable antigen from an allergen. For example, the allergen may be from
Ambrosia artemisiifolia, Ambrosia trifida, Artemisia vulgaris, Helianthus annuus,
Mercurialis annua, Chenopodium album, Salsola kali, Parietaria judaica, Parietaria
officinalis, Cynodon dactylon, Dactylis glomerata, Festuca pratensis, Holcus lanatus,
Lolium perenne, Phalaris aquatica, Phleum pratense, Poa pratensis or Sorghum
halepense. The allergen antigen may be from a tree, such as, for example, from
Phoenix dactylifera, Betula verrucosa, Carpinus betulus, Castanea sativa, Corylus
avellana, Quercus alba, Fraxinus excelsior, Ligustrum vulgare, Olea europaea,
Syringa vulgaris, Plantago lanceolata, Cryptomeria japonica, Cupressus arizonica,
Juniperus oxycedrus, Juniperus virginiana, or Juniperus sabinoides. In some cases
the antigen may be from a mite such as, for example, from Acarus
siro, Blomia tropicalis, Dermatophagoides farinae, Dermatophagoides microceras,
Dermatophagoides pteronyssinus, Euroglyphus maynei, Glycyphagus domesticus,
Lepidoglyphus destructor or Tyrophagus putrescentiae.

The allergen antigen may be from an animal such as, for example, from a 
domestic or agricultural animal. Examples of allergens from animals include those 
from cattle, horses, dogs, cats and rodents (e.g. from rat, mouse, hamster, or guinea
pig). In some cases the antigen may be from a food allergen and in others it may be
from an insect.

The antigen may be one involved in transplant rejection. The invention may be 
used to induce or promote tolerance to such an antigen in order to ameliorate or prevent
transplant rejection. 

Homologues of antigens may also be employed. Protein antigens employed 
may have homology and/or sequence identity with naturally occurring antigens, such
as any of the antigens mentioned herein. They may have any of the levels of sequence 
identity or sequence changes specified herein. 

The antigen may be a model antigen. The antigen may be one commonly used 
in experiments to assess immune responses. For example the antigen may be a 
lysozyme and in particular chicken egg lysozyme. The antigen may be ovalbumin and
in particular chicken ovalbumin. Such model antigens may have the advantage that 
antigens, T cells and other reagents may be readily available for assessing antigen 
presentation by the targeted antigen presenting cell. 

Complexes

The targeting polypeptides of the invention may be used to deliver a chosen 
antigen to antigen presenting cells (APCs). The targeting polypeptide and antigen are 
combined in the form of a complex. Thus a complex comprising a targeting 
polypeptide and an antigen is provided. In particular, the complex may comprise an
antigen selected from a pathogenic antigen, auto-antigen, an allergen and a cancer
antigen.

In some cases a nucleic acid molecule encoding the antigen may be delivered 
to antigen presenting cells using the targeting polypeptides of the invention. Thus the 
invention also provides a complex comprising a targeting polypeptide and a nucleic
acid molecule encoding an antigen. The nucleic acid molecule will be delivered to the
antigen presenting cell and result in expression of the antigen in the antigen
presenting cell. Thus reference herein to an antigen includes a nucleic acid molecule
encoding such an antigen.

A complex of the invention may comprise:

- a targeting polypeptide; and

- an antigen and/or a nucleic acid molecule encoding an antigen.

The targeting polypeptide and antigen may be joined together by any suitable
means that ensures that the antigen is also targeted to the antigen presenting cell.
Preferably, the targeting polypeptide and antigen may be present in the same
polypeptide. Thus in some cases the antigen and targeting polypeptide may be directly
fused to each other in a single polypeptide. In others the two may be present in the
same polypeptide, but separated by an intervening sequence. For example, they may
be separated by from 1 to 50, preferably from 5 to 25, more preferably from 10 to 20
amino acids.

In cases where the targeting polypeptide and antigen are present in
the same polypeptide, the two may be joined by a sequence
designed to be cleaved by a protease in the antigen presenting cell in order to allow
them to be separated. In some cases where the targeting polypeptide and antigen
are present in the same polypeptide, there may be a plurality of antigens in the
polypeptide. For example, the same antigenic sequence may be repeated several times
in the polypeptide such as from 1 to 50, preferably from 2 to 25, and more preferably
from 5 to 10 times. In some cases the polypeptide may therefore comprise repeats of
the same epitope or a group of epitopes from a particular antigen. In other cases
different antigens may be present in the polypeptide. For example, two, three, four,
five or more of any of the antigens mentioned herein may be present in the same
polypeptide. The antigens may be from the same source or different sources, and may,
for example, be from different organisms.
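
As a non-limiting illustration, the arrangement of a targeting sequence, an optional intervening sequence and one or more antigen sequences in a single polypeptide can be sketched in Python as simple string assembly. The function name, the placement of the targeting sequence at the N-terminus and the use of the same linker between every pair of parts are assumptions made for this example only.

```python
from typing import Sequence

MAX_LINKER_LENGTH = 50  # upper end of the 1 to 50 residue range given above


def build_fusion(targeting: str, antigens: Sequence[str], linker: str = "") -> str:
    """Join a targeting sequence and one or more antigen sequences into a
    single polypeptide string, placing `linker` between adjacent parts.

    Putting the targeting sequence first is an illustrative assumption.
    """
    if not antigens:
        raise ValueError("at least one antigen sequence is required")
    if len(linker) > MAX_LINKER_LENGTH:
        raise ValueError("linker exceeds 50 residues")
    return linker.join([targeting, *antigens])
```

Passing the same antigen sequence several times models repeats of a single epitope, while passing different sequences models a polypeptide carrying antigens from different sources.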

The targeting polypeptide and antigen may not be part of the same
polypeptide. For example, they may be joined together by other covalent means. The
targeting polypeptide and antigen may be joined together by a covalent bond, such as
a covalent bond between side chains, for example by disulphide bridges. The two may
be joined by a linker or other bridging molecule. In some cases the targeting
polypeptide and antigen may be provided or coated on a moiety and the complex
including the targeting polypeptide, antigen and moiety may be targeted to antigen
presenting cells by virtue of the presence of the targeting polypeptide.

The complex may comprise a plurality of antigens and/or targeting 
polypeptides. For example, a plurality of antigens may be present and may be linked 
to a single targeting polypeptide or alternatively multiple antigens may be present in a 
polypeptide with a single targeting sequence. The complex may, for example,
comprise one, two, three, five, ten or more antigens. The complex may comprise
antigens from different sources such as antigens from different organisms. The 
complex may comprise antigens from different strains of the same organism, such as 
from different strains of the same pathogen. In some cases the complex may 
comprise different allelic or mutant forms of the same antigen. For example, the 
antigens may be different forms of an antigen that display diversity that leads to 
strains of that pathogen with differing pathogenicity. The complex may comprise a 
plurality of copies of a single epitope where the epitope sequence is either present in 
the same polypeptide as the targeting polypeptide or is joined to it by one of the other 
means discussed herein. 

The complex may comprise a plurality of targeting polypeptides. For example, 
the complex may comprise 1, 2, 3 or more targeting polypeptides, such as for example 
from 5 to 50, more preferably from 10 to 25 or even more preferably from 15 to 20 
targeting polypeptides. The ratio of targeting polypeptides to antigen and/or epitope 
may be for example 1:1, 1:2, 1:5, 1:10 or 1:25. In some cases the ratio may be from
1:1 to 1:75, preferably from 1:2 to 1:50 and more preferably from 1:5 to 1:25. The
ratio of targeting sequences to antigen and/or epitope may also take such values.

In a particularly preferred embodiment of the invention the targeting 
polypeptides in the complex may be present in a dimeric form. This may be the case, 
for example, where the targeting sequence is, or is a fragment or variant of,
SSL7 and/or 7 and preferably SSL 7. In a preferred case, where a
fragment or variant is employed, it may be able to form a dimer as well as having
targeting activity.

The complex may be chemically modified, e.g. one or more, or indeed all, of 
the polypeptide types in the complex may be post-translationally modified. For 
example, they may be glycosylated or comprise modified amino acid residues. The 
polypeptides in the complexes and targeting polypeptides of the invention may
comprise amino acid analogs. In some cases, one or more peptides in the complex
may have been generated synthetically. 

In some cases libraries of different complexes may be generated. The libraries
may comprise complexes with different antigens, for example, from different
pathogens. In some cases, the library may comprise complexes with antigens from the
same source, such as from the same organism including any of those mentioned 
herein. The libraries may be encoded by libraries of nucleic acids and/or vectors of 
the invention. Libraries may be generated and then screened to identify those 
complexes showing advantageous properties. 

In some cases the complex may comprise a nucleic acid encoding the antigen
rather than the antigen itself. In particular, the complex may comprise a nucleic acid
molecule capable of expressing the antigen. Any of the sequences discussed herein
for expressing polypeptides may be employed to express the antigen.

The invention therefore provides a viral particle which comprises a targeting
polypeptide of the invention. In a particularly preferred instance the complex may
comprise a viral particle which comprises:

(i) a targeting polypeptide; and

(ii) a nucleic acid molecule encoding an antigen.

The targeting polypeptide will preferably be wholly or partially exposed on the
surface of the viral particle to allow the virus to be targeted to an antigen presenting
cell. 

The complex may comprise the nucleic acid in any suitable manner to allow 
expression of the antigen in the antigen presenting cell. The complex may comprise a 
liposome with the targeting polypeptide present wholly or partially on the surface of 
the liposome to allow targeting to antigen presenting cells.

Nucleic acids 

The present invention also provides a nucleic acid molecule comprising a 
polynucleotide sequence encoding a targeting polypeptide and an antigen and in 
particular an antigen selected from a pathogenic antigen, auto-antigen, an allergen and
a cancer antigen. 

The targeting polypeptide and antigen may be encoded by separate open 
reading frames (ORFs) or alternatively the nucleic acid may comprise an open reading 
frame encoding both the polypeptide and the antigen. In a preferred case, the
nucleotide sequences encoding the targeting polypeptide and the antigen are present in
a single open reading frame.

In a particularly preferred embodiment the nucleic acid will be able to express 
the targeting polypeptide and antigen in the form of a polypeptide comprising both. 
As discussed elsewhere herein the targeting polypeptide and antigen may be directly 
fused or alternatively may be separated by intervening sequences which are also 
encoded by the nucleic acid. 

The nucleic acid sequence may therefore encode any of the targeting 
polypeptides referred to herein. Thus the nucleic acid may comprise a sequence that 
encodes any of the SSLs, fragments thereof, or variants of either discussed herein. 
The nucleic acids may also encode any of the additional polypeptide sequences 
referred to herein. In a preferred case the nucleic acid molecule may comprise one or
more of the polynucleotide sequences of SEQ ID Nos 1, 12, 16, 37, 49, 53, 63, 67,
79, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101 and/or 107. In particular, the nucleic acid 
may comprise one or more of the regions of SEQ ID Nos 1, 12, 16, 37, 49, 53, 63, 67, 
79, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101 and/or 107 indicated herein to represent the 
CDS of a particular SSL. Thus any of the nucleic acid sequences provided herein 
which encode one or more of SSL1 to SSL14 may be used. The sequences used from 
the sequences provided herein may be restricted to the coding regions indicated or 
may also employ other regions from the gene and in particular the whole gene. 

The nucleic acid may comprise the polynucleotide sequence of one or more of 
the coding sequences provided herein for SSL1 to SSL14, a fragment of one or more
of those sequences or a variant sequence of such a sequence or fragment, where the
sequence encodes a targeting sequence able to target the encoded polypeptide to an 
antigen presenting cell. 

The sequence of an SSL may be modified by nucleotide substitutions, 
insertions or deletions. In particular, the sequences provided herein which encode a 
SSL may be altered in such a way. The nucleic acid sequence may, for example, 
comprise 1, 2, 5, 10 or 20 such substitutions, insertions and/or deletions as long
as the encoded polypeptide has targeting activity. The variant sequence may encode
from 1 to 50, preferably from 5 to 25, more preferably from 10 to 15 amino acid
insertions, deletions or substitutions as long as the encoded polypeptide displays
targeting activity. Degenerate substitutions may be made and/or substitutions may be
made which would result in a conservative amino acid substitution when the modified
sequence is translated, for example as shown in the Table above. 
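
To illustrate what is meant by a degenerate substitution, the following Python sketch translates two coding sequences using the standard genetic code and reports whether they differ at the nucleotide level while encoding the same polypeptide. The function names are hypothetical, and use of the standard genetic code is an assumption made for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence; any trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(original: str, variant: str) -> bool:
    """True if the sequences differ as nucleotides but encode the same polypeptide."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)
```

For instance, the codons CAT and CAC both encode histidine, so exchanging one for the other changes the nucleic acid without changing the encoded polypeptide.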

The nucleic acid may comprise a sequence that has at least 25%, preferably at 
least 30%, more preferably at least 35% and even more preferably at least 40% 
sequence identity to any one or more of SEQ ID Nos 1, 12, 16, 37, 49, 53, 63, 67, 79, 
83, 85, 87, 89, 91, 93, 95, 97, 99, 101 and/or 107. In particular, it may have such a
level of sequence identity to the region encoding the SSL and/or over the whole gene.

In some cases the nucleic acid may have at least 50%, preferably at least 60%, 
more preferably at least 70%, even more preferably at least 80% sequence identity to 
such sequences. In other cases the level of sequence identity may be higher, because, 
for example, the sequence is a natural allelic variant or an engineered variant. Thus in 
some instances the sequence may have at least 85%, preferably at least 90%, more 
preferably at least 95%, even more preferably at least 97% and still more preferably at 
least 99% sequence identity to any of the sequences provided herein which encode 
SSL1 to SSL14. In one case, the nucleic acid molecule will have one of the specified
levels of nucleotide sequence identity to all of, three of, or two of the sequences 
encoding SSL5, SSL7 and/or SSL9 provided herein. 

The levels of sequence identity specified may be over the entire sequence 
encoding the targeting polypeptide or targeting sequence. They may, for example, be 
over from 25 to 900 nucleotides, preferably over 50 to 700 nucleotides, more 
preferably over 75 to 350 nucleotides and even more preferably over 100 to 250 
nucleotides. Thus in some cases the level of sequence identity specified may be over 
a region of at least 50, preferably at least 75, for instance at least 100, at least 150, 
more preferably at least 200 contiguous nucleotides or most preferably over the full 
length of the nucleic acid encoding the targeting sequence or polypeptide. 
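
As an illustration of how a level of sequence identity over a region of a given
length might be evaluated, the following minimal sketch computes the percent
identity between two sequences and the best identity found over any contiguous
window. It assumes that the two sequences have already been aligned to equal
length, with "-" marking gaps; the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    a, b = a.upper(), b.upper()
    # A gap character never counts as a match.
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(a, b, window):
    """Highest percent identity over any contiguous window of the given length."""
    if window < 1 or window > len(a):
        raise ValueError("window must be between 1 and the alignment length")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    print(best_window_identity("ACGTACGT", "ACGTACGA", 4))
```

In practice the alignment itself would be produced by one of the programs
discussed below rather than supplied by hand.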

The nucleic acid may comprise the nucleotide sequence of one or more of
the SSL5, SSL7 and/or SSL9 genes whose sequences are provided herein, a
fragment of any of the sequences or a variant of the preceding sequences where the
encoded polypeptide displays targeting activity. In a preferred instance, the nucleic
acid will comprise the nucleotide sequence of one or more of the sequences provided
herein which encode SSL5, SSL7 and/or SSL9. In a preferred case the nucleic acid
may comprise the nucleotide sequence of SEQ ID Nos: SSL7 and/or SSL9, a
fragment of either or a variant of any such sequence, where the fragment or variant
retains targeting ability. In a particularly preferred case the nucleic acid will comprise
the polynucleotide sequence of a sequence provided herein encoding SSL7 and/or
SSL9.

Sequence identity and comparisons may be performed in a number of ways.
For example the UWGCG Package provides the BESTFIT program which can be
used to calculate homology (for example used on its default settings) (Devereux et al
(1984) Nucleic Acids Research 12, p387-395). The PILEUP and BLAST algorithms
can be used to calculate homology or line up sequences (typically on their default
settings), for example as described in Altschul (1993) J. Mol. Evol. 36:290-300;
Altschul et al (1990) J. Mol. Biol. 215:403-10.

Software for performing BLAST analyses is publicly available through the
National Centre for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying
short words of length W in the query sequence that either match or satisfy some
positive-valued threshold score T when aligned with a word of the same length in a
database sequence. T is referred to as the neighbourhood word score threshold
(Altschul et al, 1990). These initial neighbourhood word hits act as seeds for
initiating searches to find HSPs containing them. The word hits are extended in both
directions along each sequence for as far as the cumulative alignment score can be
increased. Extensions for the word hits in each direction are halted when: the
cumulative alignment score falls off by the quantity X from its maximum achieved
value; the cumulative score goes to zero or below, due to the accumulation of one or
more negative-scoring residue alignments; or the end of either sequence is reached.
The BLAST algorithm parameters W, T and X determine the sensitivity and speed of
the alignment. The BLAST program uses as defaults a word length (W) of 11, the
BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci.
USA 89: 10915-10919) alignments (B) of 50, expectation (E) of 10, M=5, N=4, and a
comparison of both strands.
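
The seeding and extension steps described above can be sketched as follows. This is
a simplified illustration only and not the BLAST program itself: it assumes
exact-match nucleotide words as seeds (so the threshold T is not modelled), an
ungapped extension in one direction, a match reward of 5 and a mismatch penalty of
-4, and an arbitrary drop-off value X. The function names and the short example
sequences are hypothetical.

```python
def find_seeds(query, subject, word):
    """Return (query_pos, subject_pos) pairs where a word of length W matches exactly."""
    index = {}
    for j in range(len(subject) - word + 1):
        index.setdefault(subject[j:j + word], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - word + 1)
        for j in index.get(query[i:i + word], [])
    ]


def extend_right(query, subject, q_start, s_start, word,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed to the right without gaps.

    Extension halts when the cumulative score falls x_drop below its maximum,
    when it reaches zero or below, or when either sequence ends.
    Returns (best_score, length_of_best_extension_including_seed).
    """
    score = best = match * word  # the seed is an exact match of `word` residues
    best_len = word
    i = word
    while q_start + i < len(query) and s_start + i < len(subject):
        score += match if query[q_start + i] == subject[s_start + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        elif best - score >= x_drop or score <= 0:
            break
    return best, best_len


if __name__ == "__main__":
    q, s = "ACGTACGTAC", "TTACGTACGTAC"
    for qi, sj in find_seeds(q, s, 4):
        print(qi, sj, extend_right(q, s, qi, sj, 4))
```

A full implementation would also extend to the left of each seed and would
typically use the default word length of 11 rather than the short word used here
for brevity.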

The BLAST algorithm performs a statistical analysis of the similarity between
two sequences; see e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:
5873-5877. One measure of similarity provided by the BLAST algorithm is the
smallest sum probability (P(N)), which provides an indication of the probability by
which a match between two nucleotide or amino acid sequences would occur by
chance. For example, a sequence is considered similar to another sequence if the
smallest sum probability in comparison of the first sequence to the second sequence is
less than about 1, preferably less than about 0.1, more preferably less than about 0.01,
and most preferably less than about 0.001.

Any combination of the above mentioned degrees of sequence identity and
minimum sizes may be used to define polynucleotides of the invention, with the more
stringent combinations (i.e. higher sequence identity over longer lengths) being
preferred. Thus, for example, a polynucleotide which has at least 90% sequence
identity over 25, preferably over 30 nucleotides forms one aspect of the invention, as
does a polynucleotide which has at least 95% sequence identity over 40 nucleotides.

The nucleic acids of the invention may comprise a number of other
sequences in addition to that encoding the targeting polypeptide and antigen. For
example, the nucleic acid may comprise sequences involved in the expression of the
polypeptide it encodes such as those discussed below in the section on vectors. The
nucleic acids may comprise primer sites, restriction sites, multiple cloning sites and
other sequences to facilitate manipulation. The nucleic acid may comprise enhancer
sequences to facilitate gene expression. The nucleic acid may comprise sequences
allowing for the secretion of the encoded polypeptide or its targeting to a particular
cellular compartment. In some cases the nucleic acids may also comprise a reporter
gene or sequences, in other cases they may not.

The nucleic acids of the invention may be used in the production of targeting
polypeptides, antigens and/or complexes of the invention. Such production may take
place in vitro, in vivo or ex vivo. The polynucleotides may be used in recombinant
protein synthesis or indeed as therapeutic agents in their own right. Polynucleotides
encoding a targeting polypeptide and not an antigen or alternatively those encoding an
antigen, but not a targeting polypeptide may be used to produce targeting polypeptide
and/or antigen which can then be utilised to form complexes of the invention.

Vectors 

The present invention provides vectors comprising a nucleic acid of the
invention. Thus in one instance the invention provides a vector comprising a nucleic
acid sequence that encodes a targeting polypeptide of the invention and an antigen.
The antigen may be one selected from any of those mentioned herein and in particular
may be selected from a pathogenic antigen, an auto-antigen, an allergen and a cancer
antigen. The vector may comprise any of the nucleic acid sequences mentioned
herein. In a preferred instance, the vector may encode the targeting polypeptide and
antigen via the same open reading frame (ORF).

The invention also provides vectors which comprise a nucleic acid which
encodes the chosen antigen where the nucleic acid will be targeted to an antigen
presenting cell using a targeting polypeptide of the invention. Such vectors will
preferably be viral vectors.

The vector may, for example, be a cloning, expression and/or viral vector. The
vector may, for instance, be a plasmid vector. The vector may be a viral vector. The
vector may be a shuttle vector. The vector may comprise a selectable marker, for
instance an antibiotic resistance selectable marker.

In particular, expression vectors are provided which are capable of expressing
a targeting polypeptide of the invention and an antigen. In a preferred instance, the
vector will express a targeting polypeptide of the invention as a fusion protein with
the chosen antigen. Thus the vector may be capable of expressing a complex of the
invention. Alternatively the targeting polypeptide and antigen may be produced
separately and then linked to form a complex or the targeting polypeptide and antigen
may be expressed as a single polypeptide and then further processing is carried out to
produce a particular complex. For example, individual polypeptides may be linked
together or sequences may be cleaved from the expressed polypeptides. In some cases
sequences used in purification may be removed by cleavage.

Expression vectors are routinely constructed in the art of molecular biology
and may for example involve the use of plasmid DNA and appropriate initiators,
promoters, enhancers and other elements, such as for example polyadenylation signals
which may be necessary, and which are positioned in the correct orientation, in order
to allow for protein expression. Other suitable vectors would be apparent to persons
skilled in the art. By way of further example in this regard we refer to Sambrook et
al., 1989. The expression vector may be a prokaryotic or a eukaryotic expression
vector. In some cases the expression vector may be used to produce the encoded
protein in vitro. In other cases, the expression vector will be intended to generate in
vivo expression of the encoded protein and may be used, for example, in a method of
therapy.

Once coding sequences for desired proteins have been prepared or isolated,
such sequences can be cloned into any suitable vector or replicon. Numerous cloning
vectors are known to those of skill in the art, and the selection of an appropriate
cloning vector is a matter of choice. Ligations to other sequences may be performed
using standard procedures known in the art. The vector may be, for example, a
plasmid, virus or phage vector provided with an origin of replication, optionally a
promoter for the expression of the polynucleotide and optionally a regulator of the
promoter. The vectors may contain one or more selectable marker genes.

Expression of the targeting polypeptide and/or antigen will typically be driven
by a promoter. The promoter will usually be chosen on the basis of the cell the
expression vector is to be used in. Thus for prokaryotic expression a prokaryotic
promoter will typically be used, whilst for eukaryotic expression a eukaryotic or viral
promoter will typically be employed. The promoter employed may be a viral or
non-viral promoter. The promoter may be a mammalian promoter, such as a cell or tissue
specific promoter or alternatively a promoter expressed in a wide range of cells.
Other types of regulatory elements may also be present in the vector, for example,
enhancer sequences.

Mammalian promoters, such as β-actin promoters, may be used.
Tissue-specific promoters are especially preferred. Viral promoters may also be used, for
example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR), the
Rous sarcoma virus (RSV) LTR promoter, the SV40 promoter, the human
cytomegalovirus (CMV) IE promoter, adenovirus, HSV promoters (such as the HSV
IE promoters), or HPV promoters, particularly the HPV upstream regulatory region
(URR). Viral promoters are readily available in the art.

In one preferred case the promoter employed may be one that is capable of
driving expression in an antigen presenting cell. Thus a vector comprising a promoter
capable of giving rise to expression of the antigen in an antigen presenting cell is
provided. A promoter which is specific for antigen presenting cells may be
employed.

Additional sequences may also be present in the open reading frame encoding
the targeting polypeptide as well as optionally those encoding the antigen. In some
cases, sequences directing secretion or release from the cell of the targeting
polypeptide may be included. Peptide sequences allowing purification of the targeting
polypeptide may be included. Preferably sequences allowing cleavage may be
included and, for example, may be used to release the targeting polypeptide from the
sequence used to purify the expressed polypeptide.

Cells

The invention provides cells comprising a polynucleotide or vector of the
invention. Thus the invention provides for a cell comprising a polynucleotide and/or
vector encoding a targeting polypeptide and an antigen. The targeting polypeptide
and/or antigen may be any of those mentioned herein. Preferably, the cell will
express the targeting polypeptide and antigen and in particular will express
polypeptides comprising both. The cell may therefore be able to produce complexes
of the invention and in one case may secrete them.

The invention also provides for cells that can produce viruses of the invention
which have a targeting polypeptide provided on their surface either wholly or partially
to allow for targeting of the viral particle to an antigen presenting cell. The invention
provides a helper cell line that is capable of expressing the targeting polypeptide in
such a way that it is incorporated in a viral particle. Such cells may also comprise the
nucleic acid to be incorporated into the viral particles. The invention also provides
cells infected with a virus of the invention.

Cells of the invention include transient, or preferably stable higher eukaryotic
cell lines, such as mammalian cells or insect cells, produced using, for example, a
baculovirus expression system, lower eukaryotic cells, such as yeast or prokaryotic
cells such as bacterial cells. Particular examples of cells include mammalian cells
such as HEK293T, CHO, HeLa and COS cells. The cells may be human cells. The
cells may be, for instance, from any of the species and/or subjects mentioned herein.
Preferably the cell line selected may be one which is not only stable, but also allows
for normal post-translational modifications, particularly so that the antigen or epitope is
in the form in which it would naturally be encountered. The cell may, for
example, allow normal glycosylation.

A polypeptide of the invention may be expressed in cells of a transgenic
non-human animal. A transgenic non-human animal expressing a polypeptide of the
invention is included within the scope of the invention. Such an animal may for
example be a rodent (e.g. a mouse or rat). Preferred polypeptides of the invention may
also be expressed in Xenopus laevis oocytes.

The sequences encoding the targeting polypeptide of the invention and/or the
antigen may be introduced into a chosen cell by any suitable technique and may be
generally referred to without limitation as "transformation". For eukaryotic cells,
suitable techniques may include calcium phosphate transfection, DEAE-Dextran,
electroporation, liposome-mediated transfection and transduction using retrovirus or
other virus; e.g. vaccinia or, for insect cells, baculovirus. For example, the calcium
phosphate precipitation method of Graham and van der Eb, Virology 52:456-457
(1978) can be employed. General aspects of mammalian cell host system
transformations have been described in U.S. Patent No. 4,399,216. For various
techniques for transforming mammalian cells, see Keown et al., Methods in
Enzymology, 185:527-537 (1990) and Mansour et al., Nature 336:348-352 (1988).
Nucleic acids and vectors of the invention may be introduced into target cells both in
vitro and in vivo. In particular, viral based systems may be used to introduce the
nucleic acids and/or vectors of the invention into cells, particularly in vivo.

Antigen Presenting cells (APCs) 

The invention allows a chosen antigen to be delivered to an antigen presenting
cell. The targeting polypeptide present in the complex directs the complex to an
antigen presenting cell. The complex is taken up by the antigen presenting cell. This
means that the antigen may then be presented by the antigen presenting cell.

The invention also provides for the delivery of a nucleic acid molecule
encoding an antigen through the use of a targeting polypeptide of the invention. The
targeting polypeptide will preferably be on the surface of a virus. Infection of the
antigen presenting cell by the virus results in the production of antigen which may
then be presented by the cell.

The antigen presenting cell may be any suitable antigen presenting cell. In
particular, it may be a professional antigen processing cell and will typically express
MHC molecules. The antigen processing cell will typically express MHC II molecules
and the complex will allow the chosen antigen, or peptides derived from it, to be
presented via MHC II molecules. In some cases the antigen, or peptides derived from
it, may be presented via MHC I. The antigen may be presented by both MHC I and II,
although typically presentation will be via MHC II. Examples of MHC I and II
molecules which may be involved in presentation include HLA-A2, HLA-B62, HLA-Bw62,
HLA-B35, HLA-DRB1, HLA-DRB2, HLA-DRB3, HLA-DRB5, HLA-DRB7, HLA-A25,
HLA-B8, HLA-B52, HLA-DQB1, HLA-A3, HLA-A11 or HLA-B27.

Examples of antigen presenting cells that the invention may be used to target
antigens to include dendritic cells, monocytes and/or B cells. In particular,
monocytes and/or dendritic cells may be targeted and in a preferred instance the
antigen presenting cell is a dendritic cell. The B cells may typically be CD19+ B
cells. The monocytes may, for example, be CD14+ cells and such cells may also be
CD2lo cells.

In cases where the antigen presenting cell is a dendritic cell it may express cell
surface markers known to be characteristic of dendritic cells. In particular the dendritic
cell may express CD11c, CD209 (also known as DC-SIGN) and/or CD13. The cell may
be CD14lo. In some cases the cell may express all of CD11c, CD209 and CD13 and
may also be CD14lo.

In a preferred instance the antigen presenting cell may be a dendritic cell. Any
type of dendritic cell may be targeted. Examples of dendritic cells include a monocyte
derived dendritic cell, a plasmacytoid derived dendritic cell or an interstitial dendritic
cell. The dendritic cell may be a Langerhans cell. The dendritic cell may be one
present in an organ or tissue. The dendritic cell may be one from, or present in, a
mucosal surface. The dendritic cell may be an intestinal dendritic cell such as one
obtained from Peyer's patches. The dendritic cell may be one present in, or obtained
from, an immune tissue such as from a secondary lymph node. The dendritic cell may
be present in, or obtained from, the spleen or a lymph node.

The term antigen presenting cell covers any cell that can present an antigen
targeted to it via a complex of the invention. In some cases antigen presenting cells
may have different stages in their development during which they, for example,
predominantly take up antigen, rather than present it. For example, it is thought that
dendritic cells may have immature stages characterised by the uptake of large
amounts of potential antigens and more mature stages characterised by lower amounts
of antigen uptake, but increased amounts of antigen presentation of the antigens they
acquired earlier. The dendritic cell may, for example, be present in the periphery and
effectively collect antigen and then move to areas such as secondary lymphoid tissues
to present antigens. In one case the invention encompasses targeting of an antigen to
an immature antigen presenting cell subsequently capable of presenting the antigen it
has taken up, such as to an immature dendritic cell.

In some cases it may be desirable to target immature dendritic cells in
instances where it is desired to induce tolerance. In particular, delivery of antigen by
employing a targeting polypeptide of the invention in the absence of a stimulus which
induces or promotes dendritic cell maturation can result in tolerance. To achieve
tolerance the targeting polypeptide may preferably be used to deliver the chosen
antigen or nucleic acid encoding the antigen in the absence of an adjuvant. An
advantage of employing the targeting polypeptide of the invention to induce tolerance
is that, unlike many methods for inducing tolerance, a large dose of antigen is not
required. Steinman et al (2003) Ann. Rev. Immunol., 21:685-711 discusses the
induction of tolerance via dendritic cells and the methods and assays discussed therein
may be employed when inducing tolerance using the methods of the invention. In
situations where it is desired to induce tolerance peripheral dendritic cells may
preferably be targeted.

Conversely, in situations where it is desired to promote an immune response
against the antigen, the targeting polypeptide may preferably be employed with an
adjuvant and in particular one which induces or promotes stem cell maturation such as
aluminium hydroxide.

In some cases the antigen presenting cells may be manipulated in vitro and this may
allow control of whether the cells are exposed to stimuli which promote dendritic cell
maturation. Thus by ensuring that the cells are not exposed to stimuli responsible for
inducing maturation the resultant cells may be used to induce tolerance. In other cases
the cells will be exposed to stimuli which promote dendritic cell maturation and hence
the cells can be used to promote an immune response when they are transferred to a
subject.

Antigen presenting cells may be isolated from any suitable source including
any of those mentioned herein. For example, such cells may be isolated from the
white cells of the blood. Methods based on cell density such as LYMPHOPREP™
and centrifugation may be employed. The cells may be isolated using a variety of
techniques including antibody based techniques. The cells may be isolated using
negative and positive selection techniques based on surface markers which are present
and/or those that are not present on antigen presenting cells. In some cases, antigen
presenting cells may be obtained by exposing other cells, such as precursor cells, to
appropriate stimuli.

Dendritic cells may be obtained by treating monocytes with appropriate
stimuli such as GM-CSF and/or IL-4. For example, cells may be cultured in the
presence of 10 to 500 ng/ml, preferably from 25 to 200 ng/ml, more preferably from
50 to 150 ng/ml and even more preferably in the presence of 100 ng/ml GM-CSF. In
particular, human and preferably recombinant human GM-CSF may be employed. The
cells may be cultured in the presence of 10 to 250 ng/ml, preferably from 25 to 150
ng/ml, more preferably from 40 to 70 ng/ml and even more preferably in the presence
of 50 ng/ml IL-4. T and B lymphocytes may be removed by appropriate selection
such as on the basis of the markers CD3, CD2 and/or CD19.
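
Purely as a worked illustration of the arithmetic involved in reaching such final
concentrations, the following minimal sketch applies the usual dilution relation
(stock concentration x stock volume = final concentration x culture volume). The
stock concentrations and culture volume used in the example are hypothetical, and
the small volume contributed by the added stock itself is neglected.

```python
def stock_volume_ul(final_ng_per_ml, culture_volume_ml, stock_ug_per_ml):
    """Volume of cytokine stock (in microlitres) giving the desired final concentration.

    Uses C1 * V1 = C2 * V2 and ignores the volume added by the stock itself.
    """
    if final_ng_per_ml < 0 or culture_volume_ml <= 0 or stock_ug_per_ml <= 0:
        raise ValueError("concentrations and volumes must be positive")
    stock_ng_per_ml = stock_ug_per_ml * 1000.0
    volume_ml = final_ng_per_ml * culture_volume_ml / stock_ng_per_ml
    return volume_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical example: 10 ml culture, stocks at 10 ug/ml.
    print(stock_volume_ul(100.0, 10.0, 10.0))  # GM-CSF at 100 ng/ml
    print(stock_volume_ul(50.0, 10.0, 10.0))   # IL-4 at 50 ng/ml
```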

In other cases plasmacytoid cells may be induced to differentiate into dendritic
cells by exposure to IL-3. In some instances immature antigen presenting cells may
be induced to mature using appropriate stimuli. Such treatments will typically be
performed in vitro.

The antigen presenting cells may be treated ex vivo. Thus the cells may be
recovered from a subject, loaded with antigen using the methods of the invention and
then used therapeutically. The invention provides loaded antigen presenting cells.
The invention provides antigen presenting cells which have been infected by a viral
particle of the invention.

Assays 

The targeting polypeptides of the invention are utilised to target antigens to 
antigen presenting cells. This targeting ability and the downstream effects of targeting 
can be assessed in a number of ways. 

The ability of a polypeptide to target itself, and hence a complex, to an antigen 
presenting cell can be assessed using any suitable technique. In one case in vitro 
assays may be performed to monitor binding of the targeting polypeptide to an 
antigen presenting cell. The targeting polypeptide may be labelled and the binding to 
the antigen presenting cell followed. The targeting polypeptide may be labelled with a 
fluorochrome and its binding to an antigen presenting cell assessed using techniques 
such as flow cytometry. Binding at 4°C and 37°C may be compared to help
demonstrate that the binding is specific. 

The ability of the polypeptide under study to target to an antigen presenting
cell may be compared with a polypeptide known to have targeting ability. For
example, where the test targeting polypeptide is based on the sequence of a particular
SSL the ability to compete against that SSL may be compared using two or more
different labels. The ability of a targeting polypeptide to compete against varied
concentrations of a polypeptide with known binding activity may be measured. In this
way polypeptides with binding activity may be identified and the level of targeting
ability that they display quantified. Such assays will typically be performed in vitro
but the ability of antigen presenting cells under in vivo or ex vivo conditions to take up
test polypeptides and/or complexes may also be measured. Again, labelling may be used
to study delivery to antigen presenting cells.
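
One way in which the results of such a competition assay might be quantified is
sketched below. This is an illustrative calculation only: it assumes that binding of
the labelled polypeptide is read out as a fluorescence signal (for example a mean
fluorescence intensity from flow cytometry), and the function names, the background
correction and the numerical values are hypothetical.

```python
def percent_inhibition(signal_with_competitor, signal_no_competitor, background=0.0):
    """Percent reduction in labelled-polypeptide binding caused by a competitor."""
    span = signal_no_competitor - background
    if span <= 0:
        raise ValueError("uncompeted signal must exceed background")
    return 100.0 * (1.0 - (signal_with_competitor - background) / span)


def competition_curve(concentrations, signals, signal_no_competitor, background=0.0):
    """Pair each competitor concentration with the inhibition observed at it."""
    if len(concentrations) != len(signals):
        raise ValueError("one signal is required per concentration")
    return [
        (c, percent_inhibition(s, signal_no_competitor, background))
        for c, s in zip(concentrations, signals)
    ]


if __name__ == "__main__":
    # Hypothetical readings, not experimental data.
    for conc, inhibition in competition_curve(
        [0.1, 1.0, 10.0], [910.0, 550.0, 190.0],
        signal_no_competitor=1000.0, background=100.0,
    ):
        print(conc, round(inhibition, 1))
```

Comparing such curves for a test polypeptide and a polypeptide of known binding
activity gives a simple relative measure of targeting ability.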

The ability of a targeting polypeptide to enter the antigen presentation
pathway may also be assessed. The presence of the targeting polypeptide in a
subcellular compartment associated with antigen presentation may be measured.
Labelling may be used to achieve this. Confocal microscopy can be used to confirm
that the label, and hence the targeting polypeptide, has entered the cell. Stains for
subcellular compartments associated with antigen presentation may be employed. For
example, Texas red dextran staining may be employed to identify such compartments
and co-staining may be used to confirm the presence of the targeting polypeptide in
such compartments. In addition, the association of the complex, or part of it, in the
same regions as MHC molecules, particularly MHC II, may be examined.

The ability of a particular targeting polypeptide to lead to antigen presentation
may be studied. Thus the presence of peptide sequences from the antigen being
presented on the cell surface by MHC may be studied. In particular, presentation by
MHC II may be examined. Techniques may be used to elute peptides bound to MHC
and identify those originating from the chosen antigen. The degree of peptide
presented when the antigen is provided on its own and when it is provided as part of a
complex may be compared. The ability of particular polypeptides to lead to antigen
presentation may be compared including using control polypeptides with known
targeting ability. Thus the ability of two polypeptides to lead to presentation of the
same antigen may be compared.

In some assays the downstream effects of antigen presentation may be
measured. Thus assays may involve an antigen presenting cell being loaded using
test polypeptides and then antigen presentation to a second cell may be measured. The
ability of different targeting polypeptides and complexes to bring about the
downstream effects of antigen presentation may be measured including controls with
known activity. Such assays may be performed in vitro and, for example, they may be
performed using an antigen presenting cell and a T cell known to have a receptor
specific for an epitope present in the antigen. The T cell may be any suitable T cell. It
may be a CD4+ or CD8+ T cell and in particular may be a CD4+ T cell. The
downstream effects of antigen presentation may also be measured in vivo.

Downstream effects of antigen presentation include activation of the T cell.
Various signal transduction effects associated with activation of the T cell may be
measured. The activation may include the differentiation and/or proliferation of the T
cell. Thus the number and proliferation of the T cells may be measured, using, for
example, suitable labelling techniques or by measuring cell number. The expansion
of particular subsets of T cells may be measured, for example by flow cytometry.
Assays involving autologous and/or allogeneic T cells may, for instance, be employed,
including any of those mentioned herein.

The activation of the T cells may lead to the release of cytokines. For example,
the activation may lead to the release of interleukins (e.g. IL-2, IL-4, IL-5, IL-6
and/or IL-10), IFNγ and/or TNF-β. In some cases antigen presentation may lead to a
Th1 type response. Thus the cytokines released may be, or predominantly be, IL-2,
IFNγ and/or TNF-β. In other cases the response may be a Th2 type response. The
cytokines released may be, or predominantly be, IL-3, IL-4, IL-5, IL-6 and/or IL-10.
The T cells may release factors that stimulate their own proliferation such as IL-2
and/or IL-4. Particular complexes and substances of the invention may be designed to
give particular responses such as any of those mentioned herein. The release of such
factors can be studied both in vitro and in vivo using techniques such as ELISA to
measure levels of such compounds. In some cases, the cytokine levels measured may
be one or more, or indeed all, of IFNγ, IL-10 and IL-13.

The presence and/or number of effector T cells and memory T cells may be
assessed. In addition various downstream effects may be measured. The number and
activity of CD4+ and/or CD8+ T cells may be measured and in particular those specific
for the chosen antigen. Antibody responses may be assessed in terms of the amount
and types of antibody produced. In one preferred instance, the delivery of the chosen
antigen using the targeting sequences of the invention leads to an antibody response
against the chosen antigen. Any of the assays discussed herein may, for instance, be
used to detect such an antibody response. In some cases tolerance may be induced and
an antibody response may not be seen and in particular subsequent challenge with the
antigen may result in a lower response or no response in comparison to the response
seen without the initial induction of tolerance.

The effects on other immune cells may be measured such as on macrophages 
and/or granulocytes. The skilled person will be aware of appropriate assays for 
assessing the immune response and immunogenicity. 

In some embodiments of the invention a nucleic acid encoding an antigen,
rather than the antigen itself, is targeted to the antigen presenting cell. In a particularly
preferred embodiment viral particles comprising targeting polypeptides may be
employed in the invention. The ability of the viruses to selectively bind to antigen
presenting cells may be measured. Non-antigen presenting cells may be employed as
controls to assess the specificity of the viruses.

Standard techniques may be used to monitor the expression of the nucleic 
acids targeted to antigen presenting cells. For example, test experiments may be done 
using reporter genes in place of an antigen encoding gene. Techniques such as RT-PCR,
Northern blotting, Western blotting and cell staining may be used to monitor
expression in antigen presenting cells. Presentation of the antigen may be evaluated. 

The above techniques may be used to identify effective targeting sequences. 
They may also be used to assess the efficacy of the invention in promoting an immune 
response against a chosen antigen. The techniques may also be used to assess the 
efficacy of the substances of the invention in the treatment or prevention of any of the
conditions referred to herein. Typically suitable controls will be employed. In some 
cases standard test antigens and/or targeting polypeptides may be employed and 
compared to the substance under test. 

Loading antigen presenting cells

The invention provides for the loading of antigen presenting cells. Thus the 
invention provides a method of loading an antigen presenting cell, comprising 
contacting an antigen presenting cell with a complex of the invention. The targeting 
polypeptide present in the complex directs the complex to the antigen presenting cells.
The complex is taken up by the antigen presenting cell and the antigen is presented by
the antigen presenting cell as discussed herein. The targeting of the complex to the 
antigen presenting cell is termed loading of the antigen presenting cell. 

The invention also encompasses the loading of antigen presenting cells using a 
targeting polypeptide of the invention to deliver a nucleic acid encoding an antigen to
antigen presenting cells. In particular viral particles comprising the targeting
polypeptide may be employed. The invention therefore provides a method of loading
an antigen presenting cell comprising using a targeting polypeptide of the invention to
deliver a nucleic acid molecule encoding an antigen to the antigen presenting cell. The
invention provides for the infection of an antigen presenting cell with a virus of the
invention.

The loading of antigen presenting cells or their precursors may be performed 
in vitro, ex vivo or in vivo. In the case of in vitro loading the antigen presenting cell
may simply be contacted with a complex of the invention. The cell may be cultured in
the presence of the complex under suitable conditions. The cell and complex may, for
example, be contacted for between five minutes and ten days, preferably from an hour
to five days, more preferably from five hours to two days and even more preferably
from twelve hours to one day. Ex vivo loading may, for instance, be carried out in the
same manner once the cells to be loaded have been obtained.

Loading of antigen presenting cells or their precursors may be performed in 
vivo. Thus the invention provides an in vivo method of loading antigen presenting 
cells comprising administering to a subject an effective amount of a complex of the
invention. Administration may be via any of the routes discussed herein. Loading in
vivo may also be achieved by administering a nucleic acid, vector, cell, virus, vaccine
or pharmaceutical composition of the invention. The administration of such products
results in complexes of the invention coming into contact with antigen presenting
cells. Thus the nucleic acid, vector, virus or cells may lead to the production of a complex
of the invention which then loads an antigen presenting cell present in the subject.

In one case antigen presenting cells recovered from a subject are loaded in
vitro with a complex of the invention. The invention therefore provides an ex vivo
method of loading antigen presenting cells. The loaded cells may then be returned to a 
subject and in particular to promote an immune response against the antigen that the 
cells have been loaded with. 

The invention also provides antigen presenting cells which have been loaded
using the complexes of the invention. Such antigen presenting cells will typically
comprise the complex or breakdown products of the complex and in particular the 
antigen to be presented. The loaded cells may comprise epitopes of the antigen. The 
cell may also comprise the targeting polypeptide and/or breakdown products thereof.
The invention provides such cells in an isolated form. The loaded cells may comprise
a nucleic acid of the invention and in particular may comprise a viral vector of the 
invention. Preferably, the viral vector will be replication deficient. 

Generally the antigen presenting cell of the invention carries peptides, and in 
particular an antigenic epitope, derived from the chosen antigen on its surface in
conjunction with an MHC class I or class II molecule and in particular in conjunction
with an MHC II molecule. In one embodiment the antigen presenting cell has at least 
100, preferably at least 200, for example at least or about 500 or 1000, class I and/or 
class II molecules on its surface loaded with the product and in particular class II
molecules. In some cases, the cells may carry a label or be labelled, such as, for
example, with thymidine or radioactive chromium. The invention also provides T
cells that have been activated by a loaded antigen presenting cell of the invention.

In some cases antigen presenting cells may be recovered from a subject, 
loaded in vitro and then returned to the same subject. In other cases, it may be that T
cells are recovered from a subject, exposed to loaded antigen presenting cells of the
invention in vitro and then returned to the subject.

In one case the invention may provide a composition comprising T cells, 
antigen presenting cells and a complex of the invention. The T cells or antigen 
presenting cells may be any of the cells mentioned above. In particular, the antigen
presenting cells may be dendritic cells. The T cell:antigen presenting cell ratio is
typically from 500:1 to 1:500. Typically at least 10³, such as at least or about
10⁵, 10⁶, 10⁷, 10⁸ or 10⁹, cells are present per millilitre of the composition. The
composition typically also comprises a culture medium capable of supporting the T
cells or antigen presenting cells, such as RPMI medium. The medium may also
comprise cytokines, such as IL-2, IL-4, IL-7 or TNF-α. The T cells and antigen
presenting cells may be from the same individual. 
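
As an illustration only, the typical ranges quoted above (a T cell:antigen presenting cell ratio between 500:1 and 1:500, and at least 10³ cells per millilitre) can be expressed as a simple check. The function name `check_composition` is hypothetical, and the sketch assumes that the cell density is taken over all cells in the composition, which the text does not state explicitly.

```python
def check_composition(t_cells: int, apcs: int, volume_ml: float) -> dict:
    """Compare a T cell / APC mixture with the typical ranges quoted in the text."""
    if t_cells <= 0 or apcs <= 0 or volume_ml <= 0:
        raise ValueError("cell counts and volume must be positive")
    # Ratio window of 500:1 to 1:500, tested with integer arithmetic
    # so that the boundary values are handled exactly.
    ratio_ok = t_cells <= 500 * apcs and apcs <= 500 * t_cells
    # Assumption: density is counted over T cells plus antigen presenting cells.
    cells_per_ml = (t_cells + apcs) / volume_ml
    return {
        "ratio_ok": ratio_ok,
        "cells_per_ml": cells_per_ml,
        "density_ok": cells_per_ml >= 1e3,
    }
```

For example, `check_composition(1_000_000, 100_000, 1.0)` describes a 10:1 mixture at 1.1 x 10⁶ cells per millilitre, which falls inside both typical ranges.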

The cells employed in the invention, particularly the antigen presenting cells
and/or T cells, may be autologous cells, or cells which have been partially or fully
matched with the subject for MHC class I HLA-A or HLA-B; and/or for MHC class II
type. In a preferred case, the cells employed in the invention may be recovered from
a subject and utilised ex vivo and subsequently returned to the same subject. 

Delivery of nucleic acid molecules encoding antigens to antigen presenting cells 

The targeting polypeptides of the invention may be used to deliver a nucleic 
acid molecule which encodes an antigen to an antigen presenting cell. The nucleic
acid molecule gives rise to expression of the antigen in the antigen presenting cell and
to the subsequent presentation of the antigen by the cell.

The nucleic acid molecule will preferably comprise a promoter which is
operably linked to the sequences encoding the antigen and which is active in antigen 
presenting cells or which can be induced in antigen presenting cells. 

In a particularly preferred embodiment the nucleic acid encoding the antigen 
may be delivered to the antigen presenting cell via a viral particle which comprises a
targeting polypeptide of the invention. The targeting polypeptide will typically be 
provided wholly or partly on the surface of the virus in order for the polypeptide to be 
able to target the virus to an antigen presenting cell. 

Any suitable virus may be used in such embodiments. The virus may, for 
example, be a retrovirus, a lentivirus, an adenovirus, an adeno-associated virus, a
vaccinia virus or a herpes simplex virus. In a particularly preferred embodiment the
virus may be a lentivirus. The lentivirus may be a modified HIV virus suitable for use
in delivering genes. The lentivirus may be a SIV, FIV, or equine infectious anemia
virus (EQIA) based vector. The virus may be a Moloney murine leukaemia virus
(MMLV).

The targeting polypeptide may comprise sequences from the virus. For 
example, the targeting polypeptide may comprise sequences from a viral surface 
protein. In particular, the targeting sequences may be at the N terminus of the 
targeting polypeptide and be fused or linked to surface protein sequences. The
targeting polypeptide may also comprise a transmembrane domain so that it can be
provided in the viral membrane. 

In a particularly preferred embodiment the targeting polypeptide may include 
sequences from a surface protein of the virus. In particular the targeting polypeptide 
may comprise sequences from a surface protein that is involved in the normal binding
of the virus to its target cell. The binding of the targeting polypeptide to the antigen
presenting cell may lead to conformational changes allowing the viral surface protein 
sequences to bind to their target on the cell surface. This may lead to fusion of the 
viral and cell membranes allowing entry of the virus into the antigen presenting cell. 
Thus the targeting sequences and surface protein domains may show receptor
cooperation to facilitate the entry of the virus specifically into the antigen presenting
cells. The targeting polypeptide may include linker sequences which facilitate the
cooperation and in particular proline rich sequences present in the viral surface protein
may be employed. The proline rich linkers discussed in Martin et al. (2003) Journal of
Virology 77(4): 2753-2756 and Valsesia-Wittmann et al. (1997) EMBO Journal,
16(6): 1214-1223 may be employed.

The invention provides a virus comprising a targeting polypeptide as well as a 
cell infected by such a virus. The virus will typically also comprise a nucleic acid 
molecule encoding the chosen antigen. The virus may, for instance, be any of those
mentioned herein. The nucleic acid molecule may also encode other sequences, for 
example the nucleic acid sequences may comprise sequences which express proteins 
which boost the immune response to the antigen. The nucleic acid may encode a 
cytokine, including any of those mentioned herein and in particular IL-1, IL-2 and/or
IL-12. The nucleic acid may also encode a costimulatory molecule such as a surface
polypeptide which enhances the immune response. The nucleic acid may encode, for 
example, CD80 and/or CD86. 

The viruses of the invention are preferably replication deficient. In some cases 
the nucleic acid sequences encoding the targeting polypeptide will not be included in
the viral vector. Thus the invention also provides helper cells which express the
targeting polypeptide in such a way that the targeting polypeptide is incorporated into
the viral particles. The invention also provides nucleic acid vectors that encode a 
targeting polypeptide of the invention which comprises viral sequences. 

Medicaments, methods and therapeutic use

The complexes of the invention and various related aspects of the invention 
may be used in a method of therapy of the human or animal body. Thus the
invention provides for the use of a targeting polypeptide, a complex, a nucleic acid, a
vector, a cell, a virus, or an antigen presenting cell of the invention in a method of
treatment of the human or animal body by therapy.

The invention provides for the use of a complex of the invention, a nucleic 
acid encoding a targeting polypeptide and antigen of a complex of the invention, a 
vector comprising such a nucleic acid, a cell comprising such a nucleic acid or vector, 
a virus of the invention or an antigen presenting cell of the invention in the 
manufacture of a medicament for use in immunisation.

In a preferred case the invention provides for the use of a complex comprising: 
(a) a targeting polypeptide comprising a staphylococcal superantigen-like
protein (SSL), a fragment thereof or a variant of either, where the SSL,
fragment or variant has the ability to target the complex to an antigen
presenting cell; and
(b) an antigen or a nucleic acid encoding an antigen,
in the manufacture of a medicament for use in immunisation.

In some instances the antigen comprises a polypeptide which is present in the 
complex as a fusion polypeptide with the targeting polypeptide. In others the antigen 
and targeting polypeptides are not part of the same polypeptide, but are covalently 
joined to each other or are joined through a linker. In a preferred case the antigen is a 
pathogenic antigen, an auto-antigen, an allergen and/or a cancer antigen. In another,
the targeting polypeptide is present as a dimer. 

Immunisation may result in promoting an immune response against the chosen 
antigen. Any of the effects resulting from targeting antigen presenting cells 
mentioned herein may be promoted or achieved. In particular the level of presentation 
of the chosen antigen will be increased. An increase in presentation via MHC I and/or
MHC II molecules and in particular via MHC II molecules may be seen. In a 
preferred case the level of antigen presentation achieved may be such that when the 
same antigen is encountered again an increased immune response is seen in 
comparison to the response seen if the initial immunisation had not taken place. In particular a
therapeutic and/or protective immune response may be raised. The invention may
therefore ensure that a higher level of immune response is seen when the antigen is 
next encountered. 

The invention may be used to enhance the level of antigen presentation or of 
any of the downstream effects thereof, such as any of those mentioned herein, in
comparison to administration of an equivalent amount of antigen in the absence of a
targeting polypeptide. The increase may be two-fold, three-fold or more; in some
cases it may be at least ten-fold, preferably at least twenty-fold and even more
preferably at least 100-fold, or 1000-fold or more. It may be that a protective or
therapeutic response is seen whereas in the absence of a targeting
polypeptide it is not.
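
The fold enhancement referred to above is simply the ratio of a readout obtained with the targeting polypeptide to the same readout obtained with an equivalent amount of antigen alone. The following is a minimal sketch of that arithmetic; the function names and the choice of readout (for example an antibody titre or a proliferation count) are assumptions made for illustration and are not prescribed by the text.

```python
def fold_enhancement(targeted: float, untargeted: float) -> float:
    """Ratio of the readout with targeting polypeptide to the readout without it."""
    if untargeted <= 0:
        raise ValueError("the untargeted readout must be positive")
    return targeted / untargeted


def describe_fold(fold: float) -> str:
    """Map a fold enhancement onto the thresholds named in the text."""
    thresholds = [
        (1000, "at least 1000-fold"),
        (100, "at least 100-fold"),
        (20, "at least twenty-fold"),
        (10, "at least ten-fold"),
        (3, "three-fold or more"),
        (2, "two-fold or more"),
    ]
    # Thresholds are ordered from largest to smallest, so the first match
    # is the most stringent level that the measurement satisfies.
    for minimum, label in thresholds:
        if fold >= minimum:
            return label
    return "less than two-fold"
```

A readout of 250 units with targeting against 10 units without gives a 25-fold enhancement, which `describe_fold` reports as "at least twenty-fold".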

The invention also provides for an agent for immunising a subject, the agent 
comprising a complex of the invention, a nucleic acid encoding a targeting 
polypeptide and antigen of a complex of the invention, a vector comprising such a
nucleic acid, a cell comprising such a nucleic acid or vector, or an antigen presenting
cell of the invention. 

The various diseases and conditions to be prevented or treated may be any of 
those mentioned herein or associated with an antigen mentioned herein. In particular, 
the disease may be one associated with a pathogen, such as a bacterium, virus,
parasite, protozoan, fungus, and/or prion. In some cases the disease may be
a cancer such as any of those mentioned herein. 

In some instances the invention will be used to induce tolerance to a particular 
antigen and in particular to an allergen or an auto-antigen. In such cases typically the
method will involve the delivery of the desired antigen to an antigen presenting cell in
the absence of a stimulus which promotes antigen presenting cell maturation. In
particular, the antigen may be delivered in the absence of an adjuvant such as
aluminium hydroxide. The immunisation methods and vaccines of the invention may
be used to induce tolerance to a selected antigen.

The substances of the invention may be administered to any suitable subject.
The subject on which the method of the invention is performed is generally a
vertebrate subject. By "vertebrate subject" is meant any member of the subphylum
Chordata, particularly mammals, including, without limitation, humans and other
primates, as well as rodents, such as mice and rats. The subject may be a non-human
animal. The non-human animal may be a domestic animal or an agriculturally
important animal. The animal may be a domestic pet. The animal may be a non-human
primate such as a monkey. The term subject does not denote a particular age.
Thus, both adult and newborn individuals are intended to be covered. In one
embodiment the subject is susceptible to or at risk from the relevant disease. For
example, the subject may have been exposed, or will be in a region where there is a
risk of exposure, to a particular antigen and in particular a pathogen.

The invention also covers the use of the complexes of the invention to 
promote antigen presentation of a chosen antigen. In some cases the invention may be
used to bring about antigen presentation in an animal model, for example to study
whether or not a particular immune response can be raised. The efficacy of the
invention may be assessed in such non-human animal models. In some cases the 
invention may be used to help generate an immune response in a non-human animal 
in order to obtain antibodies against a chosen antigen that can then be recovered. Thus 
the invention may be used in a method of antibody production.

The invention may be used in combination with other means of, and
substances for, immunisation. In some cases the complexes of the invention may be
administered simultaneously, sequentially or separately with antigen which is not part
of a complex of the invention. Thus complexes may be administered with the same
antigen in a form not linked to a targeting polypeptide. The substances of the
invention may be used in combination with existing vaccines for a particular antigen
and may, for example, be simply mixed with such vaccines. Thus the invention may 
be used to increase the efficacy of existing vaccines including, for example, peptide, 
polypeptide, nucleic acid, viral and/or bacterial based antigens. 

Pharmaceutical compositions, vaccines and administration 

The invention additionally provides pharmaceutical compositions comprising 
a complex, nucleic acid, vector and/or cell of the invention and a pharmaceutically 
acceptable carrier or diluent. The present invention also provides a vaccine
composition comprising a complex, nucleic acid, vector and/or cell of the invention.
The vaccines and compositions may comprise any of the substances mentioned herein 
and in particular the complexes, nucleic acid molecules, vectors, viruses and cells of 
the invention. The invention provides a method of vaccination comprising 
administering to a subject an effective amount of a vaccine composition of the
invention.

The various compositions, vaccines and other substances of the invention may 
be formulated using any suitable method. Formulation with standard 
pharmaceutically acceptable carriers and/or excipients may be carried out using
routine methods in the pharmaceutical art. For example, an active substance may be
dissolved in physiological saline or water for injections. The exact nature of a
formulation will depend upon several factors including the particular substance to be
administered and the desired route of administration. Suitable types of formulation
are fully described in Remington's Pharmaceutical Sciences, 19th Edition, Mack
Publishing Company, Easton, Pennsylvania, USA, the disclosure of which is included
herein in its entirety by way of reference.

The substances may be administered by enteral or parenteral routes such as via 
oral, buccal, anal, pulmonary, intravenous, intra-arterial, intramuscular, 
intraperitoneal, topical or other appropriate administration routes. The substances may 
in some cases be administered to sites characterised by the presence of antigen
presenting cells. In cases where loaded antigen presenting cells are administered they
may be administered, for example, to sites of antigen presentation such as secondary 
lymph nodes. 

Vaccines may be prepared from one or more of the complexes, nucleic acids, 
vectors, and/or cells of the invention together with a physiologically acceptable
carrier or diluent. Typically, such vaccines are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid
prior to injection may also be prepared. The preparation may also be emulsified, or
encapsulated in a liposome, particularly in the case of nucleic acids and vectors of
the invention. The active ingredient may be mixed with an excipient which is
pharmaceutically acceptable and compatible with the active ingredient. Suitable
excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like and
combinations thereof. 

In addition, if desired, the vaccine and/or pharmaceutical compositions of the 
invention may contain minor amounts of auxiliary substances such as wetting or
emulsifying agents, pH buffering agents, and/or adjuvants which enhance 
effectiveness. 

The complexes of the invention enhance the immunogenicity of a chosen 
antigen. They may therefore act as adjuvants for a chosen antigen and may be used as
adjuvants. In some cases other adjuvants may be present in the various formulations
of the invention or be administered simultaneously, separately or sequentially with 
them. Suitable adjuvants include, for example, any substance that enhances the 
immune response of the subject to the antigen (including when delivered by the 
polynucleotide of the invention). They may enhance the immune response by
affecting any number of pathways, for example, by stabilizing the antigen/MHC
complex, by causing more antigen/MHC complex to be present on the cell surface, by
enhancing maturation of APCs, or by prolonging the life of APCs (e.g., inhibiting
apoptosis).

Examples of adjuvants that may be employed include cytokines. Certain
cytokines, for example TRANCE, flt-3L, and CD40L, enhance the
immunostimulatory capacity of antigen presenting cells and may be employed.
Non-limiting examples of cytokines which may be used alone or in combination include
interleukin-2 (IL-2), stem cell factor (SCF), interleukin-3 (IL-3), interleukin-6 (IL-6),
interleukin-12 (IL-12), G-CSF, granulocyte macrophage-colony stimulating factor
(GM-CSF), interleukin-1 alpha (IL-1α), interleukin-11 (IL-11), MIP-1α, leukemia
inhibitory factor (LIF), c-kit ligand, thrombopoietin (TPO), CD40 ligand (CD40L),
tumor necrosis factor-related activation-induced cytokine (TRANCE) and flt3 ligand
(flt-3L). Further examples of adjuvants which may be effective include but are not
limited to: aluminium hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as
nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components
extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall
skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion.

In cases where the invention is used to target a nucleic acid which encodes an 
antigen to an antigen presenting cell, the nucleic acid may also encode molecules 
capable of acting as an adjuvant. Thus the nucleic acid may lead to the production of
any of the adjuvants mentioned herein and in particular a cytokine or costimulatory
molecule. The cytokine may, for example, be IL-1, IL-2 and/or IL-12, which will
preferably be secreted from the antigen presenting cell. The costimulatory molecule 
may, for example, be CD80 or CD86 which will be preferably expressed on the cell 
surface of the antigen presenting cell. 

The substances of the invention, and in particular the vaccines, are typically
administered parenterally, by injection, for example, either subcutaneously or
intramuscularly. Additional possible formulations include suppositories, oral 
formulations and formulations for transdermal administration. For suppositories, 
traditional binders and carriers may include, for example, polyalkylene glycols or
triglycerides; such suppositories may be formed from mixtures containing the active
ingredient in the range of 0.5% to 10%, preferably 1% to 2%. Oral formulations 
include such normally employed excipients as, for example, pharmaceutical grades of 
mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, 
magnesium carbonate, and the like. These compositions take the form of solutions,
suspensions, tablets, pills, capsules, sustained release formulations or powders and
contain 10% to 95% of active ingredient, preferably 25% to 70%. Where the 
substance is lyophilised, the lyophilised material may be reconstituted prior to 
administration, e.g. as a suspension. Reconstitution is preferably effected in buffer.
Capsules, tablets and pills for oral administration to a patient may be provided
with an enteric coating comprising, for example, Eudragit "S", Eudragit "L", cellulose
acetate, cellulose acetate phthalate or hydroxypropylmethyl cellulose. Substances of
the invention, and in particular nucleic acids and vectors of the invention, may also be
administered by needleless injection, for example transdermally.

The substances of the invention may be formulated as neutral or salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with free
amino groups of the peptide), which are formed with inorganic acids such as, for
example, hydrochloric or phosphoric acids, or organic acids such as acetic,
oxalic, tartaric and maleic. Salts formed with the free carboxyl groups may also be
derived from inorganic bases such as, for example, sodium, potassium, ammonium,
calcium, or ferric hydroxides, and such organic bases as isopropylamine,
trimethylamine, 2-ethylamino ethanol, histidine and procaine.

The substances are administered in a manner compatible with the dosage
formulation and in such amount as will be prophylactically and/or therapeutically
effective. The quantity to be administered depends on the subject to be treated,
capacity of the subject's immune system to synthesize antibodies, and the degree of
protection desired. Precise amounts of active ingredient required to be administered
may depend on the judgement of the practitioner and may be peculiar to each subject.
A substance of the invention may be given in a single dose schedule, or
preferably in a multiple dose schedule. A multiple dose schedule is one in which a
primary course of administration may be 1-10 separate doses, followed by other doses
given at subsequent time intervals required to maintain and/or reinforce the immune
response, for example at 1 to 4 months for a second dose, and if needed, a subsequent
dose(s) after several months. The dosage regimen will also, at least in part, be
determined by the need of the individual and be dependent upon the judgement of the
practitioner. Examples of dosages of complex that may be administered include from
5 µg to 100 mg, preferably from 50 µg to 50 mg, more preferably from 250 µg to 10
mg.
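
The example dosages above form three nested ranges. Purely as a reading aid, the sketch below converts them to micrograms and reports the narrowest range into which a given amount of complex falls. The names `DOSE_TIERS_UG` and `dose_tier`, and the tier labels, are hypothetical; the text leaves the actual dose to the judgement of the practitioner.

```python
# Nested example ranges from the text, converted to micrograms
# (1 mg = 1000 µg) and ordered from narrowest to widest.
DOSE_TIERS_UG = [
    ("more preferred", 250.0, 10_000.0),   # 250 µg to 10 mg
    ("preferred", 50.0, 50_000.0),         # 50 µg to 50 mg
    ("example range", 5.0, 100_000.0),     # 5 µg to 100 mg
]


def dose_tier(dose_ug: float) -> str:
    """Return the label of the narrowest quoted range containing dose_ug."""
    for label, low, high in DOSE_TIERS_UG:
        if low <= dose_ug <= high:
            return label
    return "outside the example ranges"
```

For instance, 1 mg (1000 µg) falls in the narrowest range, whereas 20 µg falls only in the widest one.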

In some cases the administered substances may comprise cells. The cells may, 
for example, be those comprising nucleic acids or vectors of the invention. In other
cases the cells may be loaded antigen presenting cells or may be T cells that have had 
antigen presented to them by loaded antigen presenting cells of the invention. Any 
suitable number of cells may be administered to a subject. For example, at least, or 
about, 10⁵, 10⁶, 10⁷, 10⁸ or 10⁹ cells may be administered. As a guide the number of
cells of the invention to be administered may be from 10⁵ to 10¹³, preferably from 10⁷
to 10¹¹. In such cases where cells are administered or present, culture medium may be
present to facilitate the survival of the cells. In some cases the cells of the invention 
may be provided in frozen aliquots and substances such as DMSO may be present to 
facilitate survival during freezing. Such frozen cells will typically be thawed and then
placed in a buffer or medium either for maintenance or for administration. 

The nucleotide sequences of the invention and vectors can also be
administered as outlined above. Preferably, the nucleic acid, such as RNA or DNA,
in particular DNA, is provided in the form of an expression vector, which may be
expressed in the cells of the individual to be treated. The vaccines may comprise
naked nucleotide sequences or be in combination with cationic lipids, polymers or 
targeting systems. The vaccines may be delivered by any available technique. For 
example, the nucleic acid may be introduced by needle injection, preferably 
intradermally, subcutaneously or intramuscularly. Alternatively, the nucleic acid may
be delivered directly across the skin using a nucleic acid delivery device such as
particle-mediated gene delivery. The nucleic acid may be administered topically to 
the skin, or to mucosal surfaces for example by intranasal, oral, intravaginal or 
intrarectal administration. 

Uptake of nucleic acid constructs may be enhanced by several known
transfection techniques, for example those including the use of transfection agents.
Examples of these agents include cationic agents, for example, calcium phosphate
and DEAE-Dextran and lipofectants, for example, lipofectam and transfectam. The
dosage of the nucleic acid to be administered can be altered. Typically the nucleic
acid is administered in the range of 1 pg to 1 mg, preferably 1 pg to 10 µg nucleic
acid for particle mediated gene delivery and 10 µg to 1 mg for other routes.

The following Examples illustrate the invention.
Examples 

Example 1 

Methods 



Recombinant protein expression and purification

Recombinant N-terminal histidine tagged SSL7 and SSL9 proteins from S.
aureus strain NCTC6571 were produced in E. coli using the expression vector pQE30
containing the genes NCTC6571ssl7 and NCTC6571ssl9 respectively as previously
described (Williams, R.J. et al., Infect. Immun. 68, 4407-4415 (2000)). Embp32 from
S. epidermidis was also expressed as a recombinant N-terminal histidine tag fusion
protein in E. coli and purified as previously described (Williams, R.J. et al., Infect.
Immun. 70, 6805-6810 (2002)).

Crystallisation 

Crystals of SSL7 were obtained by the hanging drop vapour diffusion
technique, at room temperature. Crystals were obtained in two different conditions.
For the first condition (form I), the well buffer contained 25-30% (w/v) PEG-MME
2K, 0.2 M ammonium sulphate and 0.1 M MES, pH 6.5. Drops consisted of 1 µl
recombinant SSL7 at 10 mg/ml and 1 µl well buffer. Crystals had a flat plate
morphology with dimensions up to 0.3 x 0.2 x 0.01 mm³. The second condition (form II)
had the same protein concentration and drop size, but the well buffer in this case
consisted of 28% (w/v) PEG 2K and 0.1 M Li₂SO₄ buffered with 0.1 M Tricine at pH
8.5. In this case the crystals were rod-shaped with approximate dimensions
0.3 x 0.05 x 0.05 mm³.

X-ray Data Collection 

Data were collected from cryocooled crystals following immersion in mother
liquor containing 30% (v/v) glycerol (as described in Garman, E. & Schneider, T.
Macromolecular Cryocrystallography. J. Appl. Cryst. 30, 211-237 (1997)).

Form I

The data were collected to 2.75 Å resolution on beamline BM14 at the European
Synchrotron Radiation Facility (ESRF; Grenoble, France), on a Mar Research CCD
detector. The data were indexed and integrated with Mosflm (Leslie, Joint CCP4 and
ESF-EAMCB Newsletter on Protein Crystallography 26, (1992)), and scaled and
merged using Scala (Evans et al., 97-103. 1997. CCLRC, Daresbury Laboratory. Ref
Type: Conference Proceeding) from the CCP4 suite (Acta Cryst. D50, 760-763
(1994)). Subsequent analysis was carried out using programs from the CCP4 suite,
unless otherwise stated. Data collection statistics are provided in Table 1.

Table 1: Data Collection and Refinement

Crystal form        Res. (Å)    Nref             Rmerge¹ (%)     I/σI            Completeness (%)   Redundancy
                                Meas.    Uni.    All     High²   All     High    All      High      All     High
Form I (P4₃2₁2)     45.5-2.75   59980    12607   9.4     46.0    14.6    3.1     98.7     99.2      4.5     4.3
Form II (P2₁2₁2₁)   30.0-2.7    99169    11049   14.0    48.0    4.6     1.5     99.9     99.9      4.5     3.9

Crystal form        Protein               Water                 R-factor³ (%)        rmsd
                    N       Bave⁴ (Å²)    N       Bave (Å²)     Working⁵   Free⁶     Bonds (Å)   Angles (°)
Form I (P4₃2₁2)     3127    43.6          18      30.9          23.5       27.5      0.008       1.35
Form II (P2₁2₁2₁)           42.6          24      42.5          23.4       29.9      0.011       1.3

¹ $R_{merge} = \sum |I_i - I_m| / \sum I_m$, where $I_i$ is the observed intensity of a reflection and $I_m$ is the mean
intensity of all related reflections.

² High: highest resolution shell. Form I: 2.9-2.75 Å; Form II: 2.85-2.70 Å.

³ $R\text{-factor} = \sum |F_{obs} - F_{calc}| / \sum F_{obs}$.

⁴ Mean B-factor.

⁵ For the 95.1% of data included in the refinement.

⁶ For the 4.9% of data randomly selected and excluded from refinement.
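
The two residuals defined in the footnotes to Table 1 can be illustrated with a minimal sketch. The function names and the toy numbers below are illustrative assumptions, not values from the data sets described here; programs such as Scala and CNS apply additional scaling and weighting that this sketch omits.

```python
def r_factor(f_obs, f_calc):
    """Sum of |F_obs - F_calc| divided by the sum of F_obs (footnote 3)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_merge(groups):
    """Sum of |I_i - I_m| divided by the sum of I_m (footnote 1).

    `groups` holds one list per unique reflection; each list contains the
    repeated intensity measurements of that reflection and its symmetry mates.
    """
    numerator = 0.0
    denominator = 0.0
    for observations in groups:
        mean_intensity = sum(observations) / len(observations)
        numerator += sum(abs(i - mean_intensity) for i in observations)
        # I_m is counted once per observation in the denominator
        denominator += mean_intensity * len(observations)
    return numerator / denominator


if __name__ == "__main__":
    # Toy amplitudes and intensities (hypothetical)
    print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
    print(r_merge([[10.0, 12.0], [20.0, 20.0]]))
```

The same `r_factor` function applies to the working and free sets; only the subset of reflections passed in differs.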
Autoindexing indicated that the crystals belonged to point group P422, with
cell dimensions a = b = 81.66 Å, c = 148.04 Å and a Matthews coefficient of 3.1
(corresponding to a solvent content of 60% v/v) for 2 molecules in the asymmetric
unit. There was a significant peak in the native Patterson map at fractional
coordinates 0.000, 0.372, 0.500 and no peaks attributable to non-crystallographic
symmetry in the self-rotation function, indicating the presence of two molecules
related by a 2-fold rotation parallel to the a-axis in the asymmetric unit.

Form II 

The data were collected to 2.7 Å resolution on beamline ID14-1 at the ESRF, on an
ADSC Quantum 4R CCD detector. Data indexing, integration, scaling and analysis
were carried out as for form I. Autoindexing and analysis of systematic absences
indicated space group P2₁2₁2₁, with a = 51.65 Å, b = 71.59 Å, c = 103.47 Å, and a
Matthews coefficient of 2.00, corresponding to a solvent content of 40%. There was
a peak in the self-rotation function at κ = 180°, confirming the presence of two
molecules in the asymmetric unit. Data collection statistics are provided in Table 1.
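
The relationship between the Matthews coefficient and the solvent content quoted for the two crystal forms can be sketched as follows. This is a minimal illustration under stated assumptions: the constant 1.23 is the conventional value for a typical protein partial specific volume, and the molecular weight of 24 kDa used in the usage example is a hypothetical round figure for a His-tagged construct of this size, not a value given in the text.

```python
def cell_volume_orthogonal(a, b, c):
    """Volume (cubic Å) of a unit cell whose three angles are all 90 degrees."""
    return a * b * c


def matthews_coefficient(volume, n_sym_ops, n_mol_asu, mol_weight_da):
    """Matthews coefficient: cell volume per dalton of protein (cubic Å per Da)."""
    return volume / (n_sym_ops * n_mol_asu * mol_weight_da)


def solvent_fraction(vm, protein_term=1.23):
    """Approximate solvent fraction, 1 - 1.23 / V_M (conventional constant)."""
    return 1.0 - protein_term / vm


if __name__ == "__main__":
    # Solvent fractions implied by the Matthews coefficients stated in the text
    print(solvent_fraction(3.1))   # form I
    print(solvent_fraction(2.0))   # form II

    # Form II cell: P2(1)2(1)2(1) has 4 symmetry operators, 2 molecules per
    # asymmetric unit. ASSUMED_MW_DA is a hypothetical molecular weight.
    ASSUMED_MW_DA = 24000.0
    volume = cell_volume_orthogonal(51.65, 71.59, 103.47)
    print(matthews_coefficient(volume, 4, 2, ASSUMED_MW_DA))
```

Because the result scales inversely with the molecular weight supplied, the computed coefficient should be read only as a consistency check against the stated values.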

Molecular Replacement 

The co-ordinates for one monomer of SSL5 (pdb-id: 1M4V; Arcus et al., J.
Biol. Chem. 277, 32274-32281 (2002)) were used for molecular replacement, which
was carried out with Molrep (Vagin & Teplyakov, J. Appl. Cryst. 30, 1022-1025
(1997)).

Form I 

There was a single clear peak in the rotation function (I/σI = 5.3, next peak
I/σI = 3.58). Since very few systematic absences were recorded during data
collection, molecular replacement was carried out in all nine possible space groups to
unambiguously identify screw axes. The best solution was for space group P4₃2₁2,
for which two molecules could be placed in the asymmetric unit, related by the
appropriate translation, with a correlation coefficient of 32.6% (for comparison, the
best solution in P4₁2₁2 had a correlation coefficient of 26.4%). Sidechains that
differed between the SSL5 and SSL7 proteins were replaced by alanines in the
correctly positioned model, and rigid-body refinement was carried out using CNS
version 1.1 (Brünger et al., Acta Cryst. D54, 905-921 (1998)). At this point, the R-factor
was 52.2% and the Rfree 54.6%. A round of simulated annealing reduced the
R- and Rfree-factors to 44.6% and 48.9% respectively.

Form II 

Molecular replacement was performed as described for form I. The best
solution had a correlation coefficient of 37.4% for two molecules in the asymmetric
unit. Following rigid-body and simulated annealing refinement in CNS, the R-factor
and Rfree were 34.16% and 42.06% respectively.

Density Modification 

For both crystal forms, cross-crystal averaging, non-crystallographic
averaging and phase improvement were carried out using Dmmulti (Cowtan, CCP4
Newsletter on Protein Crystallography (1994)) in CCP4 (Acta Cryst. D50, 760-763
(1994)), prior to calculation of maps for manual rebuilding of the model.



Model Building and Refinement 

In both cases, manual rebuilding was performed using O (Jones & Kjeldgaard,
Methods in Enzymology. Charles W. Carter, Jr. & Sweet, R.M. (eds.), pp. 173-208
(Academic Press, 1997)), and in later refinement rounds XtalView (McRee, Practical
Protein Crystallography. Academic Press, San Diego, CA (1993)). Refinement was
carried out with CNS. Alanine residues in the initial model were exchanged for the
correct sidechains where positive Fourier difference density could be seen. At a
number of positions the sequence alignment was incorrect and additional rebuilding
of the chain was required. For the refinement, an overall anisotropic B-factor
correction and bulk solvent scaling with k = 0.36, B = 26.9 Å² for form I, and k = 0.59,
B = 101 Å² for form II, were applied. Non-crystallographic symmetry restraints were
applied throughout refinement except in the later stages where there was clear
evidence of a difference between the chains. After all protein residues had been
included in refinement, a number of tightly bound waters were added where there
was a 3 rms peak in the difference Fourier map, a 1 rms peak in the 2Fo-Fc map, and
appropriate protein-water hydrogen bonds. At the end of refinement, the R-factor
was 23.5% and Rfree 27.5% for form I, and R = 23.4% and Rfree = 29.9% for form II.

A homology model of SSL9 was created using the program MODELLER
(Laskowski et al., J. Appl. Cryst. 26, 283-291 (1993)) and both SSL7 and SSL5 as
template structures.

Structural illustrations were drawn with Bobscript (Esnouf, Acta Cryst. D55,
938-940 (1999)), a modification of Molscript (Kraulis, J. Appl. Cryst. 24, 946-950
(1991)), and rendered with Raster3D (Merritt & Bacon, Methods in Enzymology 277,
505-524 (1997) and Bacon & Anderson, Journal of Molecular Graphics 6, 219-220
(1988)).



FITC labelling of SSLs

SSL7 and SSL9 were dialysed against labelling buffer (0.2 M NaHCO3, pH
9.0) overnight at room temperature (RT). 50 µl of 1 mg/ml fluorescein
isothiocyanate (FITC, Sigma) in dimethyl sulfoxide (DMSO) was added to 1 ml of a
2 mg/ml protein solution. After 4 hours incubation at room temperature in the dark,
unbound FITC was removed by size exclusion chromatography using a PD-10
(Sephadex™) column. The concentration of labelled protein and the FITC:protein
ratio were determined by spectrophotometry. All preparations gave FITC:protein
ratios of between 1:1 and 2:1.

Antibodies 

The following monoclonal antibodies (MAbs) were used: CD2 (mouse MAb
MAS 593, IgG2b; Harlan), CD3 (supernatant mouse MAb UCHT1, IgG1; obtained
from P. C. L. Beverley [Edward Jenner Institute for Vaccine Research, Compton,
UK]), CD14 (supernatant mouse MAb HB246, IgG2b; gift from P. C. L. Beverley),
and CD19 (supernatant mouse MAb BU12, IgG1; gift from D. Hardie [Birmingham
University, Birmingham, UK]).



Cell culture

Human PBMC-derived dendritic cells (DC) were generated from fresh whole
blood samples obtained from healthy volunteers (Alderman et al., Cardiovasc. Res.
55, 806-819 (2002) and Newton et al., Clin. Exp. Immunol. 133, 50-58 (2003)).
Mononuclear cells separated on Lymphoprep™ (Nycomed Pharma) by
centrifugation at 400 g for 30 minutes were incubated in six-well tissue culture plates
for 2 h at 37°C in 5% CO2 in complete medium (CM) (RPMI 1640 medium (Gibco)
supplemented with 10% fetal calf serum (FCS; PAA Laboratories), 100 U/ml
penicillin, 100 µg/ml streptomycin, and 2 mM L-glutamine (Clare Hall Laboratories,
Imperial Cancer Research Fund)). The adherent cells were cultured in fresh complete
medium with 100 ng/ml human recombinant granulocyte-macrophage colony-stimulating
factor (GM-CSF) and 50 ng/ml interleukin (IL)-4 (Schering-Plough
Research Institute). On day four of incubation, loosely adherent cells were collected,
and contaminating T and B lymphocytes were removed by incubation with CD3,
CD2, and CD19 MAbs, followed by anti-mouse IgG-coated immunomagnetic
Dynabeads™ (Dynal). The supernatant, containing highly purified DC, was cultured
for another three days in fresh complete medium with GM-CSF and IL-4. Human
PBMC-derived macrophages were obtained using the same procedure as for dendritic
cell culture, except that 10% human serum was used and no cytokines were added
(Swetman et al., Eur. J. Immunol. 32, 2074-2083 (2002)).

Binding and uptake of FITC labelled SSLs by human cells

Binding assays were performed by incubating 10⁶ cells/well in complete
medium with various concentrations of SSL-FITC (0.05-1.25 µM) for 1 hour at 4°C
or 37°C. In some experiments, 8 µM of unlabelled SSL was added to the cells
together with the labelled protein. After incubation, cells were washed three times by
centrifugation, and examined by flow cytometry. In some experiments, cells were
additionally stained for various surface markers after SSL uptake. Cells were
incubated with the relevant MAb for 30 min at 4°C, washed, and then incubated in
1:25-diluted phycoerythrin-conjugated goat anti-mouse immunoglobulin (PE,
Jackson ImmunoResearch) for 30 min at 4°C. Cells were washed, fixed in 2%
formaldehyde and examined using a FACScan flow cytometer (Becton Dickinson).

Confocal microscopy

10⁵ cells were seeded on 32 mm coverslips coated (for dendritic cells only)
overnight at 4°C with 10 µg/ml fibronectin (FN, Sigma) in HBSS (Gibco). After 2
hours at 37°C in complete medium, cells were incubated with SSL-FITC (1.25 µM)
and/or Texas Red-dextran (1 mg/ml, Molecular Probes) for 1 hour at 37°C in
complete medium. The coverslips were then washed three times in cold HBSS and
fixed in 2% paraformaldehyde. The slides were examined on a Bio-Rad Confocal
Microscope. Images were acquired from 0.5-µm optical sections of individual cells.

Results

Structure determination of SSL

Recombinant SSL7, consisting of residues 36-231 of the sequence with
accession number AF094826 in GenBank, was crystallised in two different
conditions, each of which gave rise to a different crystal form.

Form I 

The final 2.75 Å resolution model built into the electron density map
contained two SSL7 molecules, representing residues 18-213 of the recombinant
SSL7. In addition, eighteen water molecules were included at stereochemically
sensible locations. Though the His-tag and N-terminal tail were disordered and
omitted from the model, the majority of the residues had well-defined electron
density. The final R-factor and Rfree were 23.5% and 27.5% respectively, and
refinement statistics are given in Table 1. The final model had good stereochemistry,
with 98.6% of residues in the most favoured and additionally allowed regions of the
Ramachandran plot, and no residues in disallowed regions. The geometry was better
than expected for the average 2.75 Å structure according to PROCHECK analyses
(Laskowski et al., J. Appl. Cryst. 26, 283-291 (1993)).

Form II

This model again contained two SSL7 molecules: in this case residues 21-213
and 23-213 of the construct in chains A and B respectively. Twenty-four water
molecules were added in this form. Once again, with the exception of the His-tag and
N-terminal tail, the majority of the molecule had well-defined electron density. The
final R- and Rfree-factors were 23.4% and 29.9% respectively. The final model had
good stereochemistry with 96.5% of the residues in the two most favoured regions of
the Ramachandran plot. The geometry was better than expected for the average 2.7 Å
structure according to PROCHECK. Refinement statistics for both crystal forms
are provided in Table 1.

The SSL7 structure

The structure of the SSL7 monomer is shown in Figure 1(a). As predicted
from sequence comparisons, the fold is similar to that of the bacterial superantigens,
and consists of two domains. The N-terminal domain (residues 18-110) is an OB-fold,
a variety of β-barrel associated with oligosaccharide and DNA binding, while
the C-terminal domain (111-213) forms a β-grasp domain: a series of β-strands
wrapped around a helix.

In total, the structures of four copies of the SSL7 monomer were obtained (two
from each crystal form), and they are all very similar, as can be seen from the Cα-atom
root mean square deviations (rmsds; Table 2).

Table 2: Root mean square deviations (Å) between the different copies of the
SSL7 monomer.

The only differences between the structures arise from differences in the
conformations of flexible loops and of the linker between the N- and C-terminal
domains (residues 106-112). The relative orientation of the two domains to one
another, and the orientations of the individual secondary structural elements, remain
unchanged in the different copies of the monomer. For this reason, unless explicitly
stated, one example monomer, chain A from form I, will be used in the comparisons
and discussions that follow.

SSL7 and other proteins 

When the SSL7 monomer is superposed on the structure of SSL5 (SET3) (the
structure of SSL5 is provided by Arcus et al., J. Biol. Chem. 277, 32274-32281
(2002)), the other member of the family for which a three dimensional structure is
available, the two are seen to share the same fold (Figure 1(b)), as might be expected
for proteins sharing 40% sequence identity. However, when optimally superposed,
the rmsd, which is 1.33 Å over 157 spatially equivalent Cα atoms, is surprisingly
high for two such highly related proteins (a value of about 1.1 Å over the whole
structure would be anticipated from sequence identity (Chothia & Lesk, EMBO J. 5,
823-6 (1986))).
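
The Cα rmsd figures quoted in this section are root mean square distances over pairs of spatially equivalent atoms. A minimal sketch of that final step is given below; it assumes the two coordinate lists have already been optimally superposed and paired (the least-squares fitting and the selection of equivalent atoms are not shown), and the function name and toy coordinates are illustrative only.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å) between
    two equally long lists of (x, y, z) positions of paired C-alpha atoms.

    The coordinates are assumed to be superposed already; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Two hypothetical two-atom fragments displaced by 1 Å along z
    fragment_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    fragment_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(ca_rmsd(fragment_1, fragment_2))
```

Because the value depends on which atoms are counted as equivalent, an rmsd is only meaningful together with the number of atoms it was computed over, as reported throughout this section.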

The high rmsd can be largely accounted for by changes in two regions of the
structure. Firstly, there is a change in the twist of the β-sheet in the C-terminal β-grasp
domain (Figure 1(b), right hand structure, indicated by arrow). The change in
the twist results in shifts in individual residue positions as large as 6.65 Å (for the Cα
atom of G125 in the C-terminal domain). Secondly, there are changes to the
conformations of the loops on the external face of the N-terminal OB-fold (Figure 1(b),
left hand structure, indicated by arrow). These loops are associated with a generic
low affinity MHCII binding site in superantigens (Jardetzky et al., Nature 368, 711-8
(1994) and Kim et al., Science 266, 1870-1874 (1994)), and changes in them may
indicate differences in function between the two proteins. These large movements
account for the low contrast and large R-factor for the initial molecular replacement
solution, prior to the simulated annealing which successfully realigned the sheet
strands and some of the loop residues.

The SSL7 structure was also compared to a homology model of SSL9, based
on both SSL7 and SSL5 as template structures (not shown). The sequence of SSL9
has a much greater homology to SSL7 (sequence identity 49%) than to SSL5 (sequence
identity 35%). This is reflected in the model of SSL9: when the structures are
optimally superposed, SSL9 has a Cα atom rmsd of 0.6 Å over 188 spatially
equivalent atoms from SSL7, and 1.4 Å over 177 spatially equivalent atoms from
SSL5.

When SSL7 and the superantigen of known structure with which it shares the
highest sequence identity (streptococcal pyrogenic exotoxin C, SPEC; 29%; Roussel et al.,
Nat. Struct. Biol. 4, 635-43 (1997)) are optimally superposed (Figure 3), it is seen
that once again the overall fold is conserved. SPEC has some extended loops, but the
structures are otherwise very similar. Interestingly, despite the SPEC and SSL7
sequences being far more divergent than SSL7 and SSL5, the two structures
superpose nearly as well. The Cα atom rmsd for optimally superposed SSL7 and
SPEC is 1.48 Å over 134 structurally equivalent atoms. The differences between the
structures on this occasion are a slight change in the orientation of the β-grasp
domain (Figure 1(c), right hand structure, indicated by arrow) and large
differences in the orientations of the strands of the OB-fold (Figure 1(c), left hand
structure, indicated by arrow). The very large changes in conformation in this domain
are not unexpected, since the SSL proteins do not bind MHC II, and this is the region
primarily involved in this interaction in the superantigens.

Dimerisation

In both crystal forms I and II there are two molecules in the crystallographic
asymmetric unit; these two molecules are related by a proper two-fold, resulting in
the formation of an intimate dimer (Figure 1(d)). The dimer is virtually identical in
both crystal forms, as can be seen from a comparison of the residues buried in the
dimer interface (Figure 2), and the fact that the form I and form II dimers can be
superposed with an all Cα atom rmsd of 0.881 Å or 0.902 Å depending on
orientation. The remainder of the crystal packing is entirely different for each of the
forms. Since the crystals grew in different conditions and at markedly different pH, it
is unlikely that the dimers formed solely as a result of crystal packing forces.

As can be seen from Figure 1(d), the dimer interface is the result of the two β-grasp
domains interacting to create an intermolecular β-sandwich. In the process,
1122 Å² of the monomer surface in form I and 1146 Å² of this surface in form II
are buried: this is in the range seen in biologically relevant dimers (Jones &
Thornton, Progress in Biophysics and Molecular Biology 63, 31-59 (1995)). The
dimer formation results in the burial of a number of hydrophobic residues (including
F119, L128, I132), and a number of neutral polar residues contribute to a hydrogen
bonding network between the two monomers; however, no charge-charge
interactions are created. It has been shown that protein-protein interaction surfaces
differ little from the 'normal' exterior surface of proteins, but that they tend to
contain additional neutral polar residues and fewer charged ones (Lo Conte & Janin,
J. Mol. Biol. 285, 2177-98 (1999)): the SSL7 dimer interface is entirely consistent
with this. Figure 1(d) also indicates with an arrow the loop with the largest
difference between SSL7 and SSL5, the movement of which prevents steric clashes
between the two SSL7 molecules in the dimer.

Cellular tropism of SSL7 and SSL9

Peripheral blood mononuclear cells (PBMC) were incubated with various
concentrations of SSL7-FITC or SSL9-FITC for 1 hour, and cell-associated
fluorescence was measured by flow cytometry. Both SSL7 (Figure 3(a)) and SSL9 (not
shown) stained a small proportion of PBMC at 37°C, but not at 4°C. The level of
fluorescence was dose-dependent up to a maximum at 1.25 µM protein. The mean
percentage of cells stained with SSL7 (9.8±1.8, range 7.1-12.2, n=7) and SSL9
(10.9±1.1, range 9.4-12.6, n=5) was very similar. Mean fluorescence also increased
with time between five minutes and a hundred and twenty minutes (not shown),
suggesting progressive uptake of SSL protein by the cells.

In order to determine whether the interaction between SSL protein and
PBMC was specific, competitive inhibition of SSL-FITC cell labelling by unlabelled
SSL was investigated (Figure 3(b)). Excess unlabelled SSL7 was able to completely
block uptake of SSL7-FITC. In contrast, neither SSL9 nor an unrelated bacterial
protein also carrying a polyhistidine tag (Embp32) had any effect on the SSL7-FITC
signal. Conversely, only unlabelled SSL9, but not SSL7 or Embp32, blocked uptake
of SSL9-FITC. Interaction between SSL proteins and the PBMC therefore occurs via
a saturable specific receptor, and is not mediated by the histidine tag on these
proteins. Furthermore, SSL7 and SSL9 use different receptors, or different sites
within one receptor.

The PBMC sub-populations which are the targets for SSL7 and SSL9 were
further characterised by immunophenotyping, using monoclonal antibodies to the
major surface markers CD2, CD3, CD14, and CD19 (Figure 4). Both SSL7 and SSL9
were taken up by all CD14 positive cells, and by a population of CD2-low cells, a
phenotype consistent with that of peripheral blood monocytes (Crawford, K. et al., J.
Immunol. 163, 5920-5928 (1999)). Neither SSL7 nor SSL9 showed any interaction
with CD3 positive T cells. Interestingly, SSL7-FITC but not SSL9-FITC stained a
subpopulation of CD19 B cells, providing further evidence that the receptors for these
two SSLs are distinct.

Uptake of SSLs by dendritic cells 

Peripheral blood monocytes were cultured in vitro in the presence of GM-CSF
and IL-4, in order to drive their differentiation into myeloid dendritic cells
(Sallusto & Lanzavecchia, J. Exp. Med. 179, 1109-1118 (1994)). After depletion of
residual lymphocytes, the population obtained after seven days culture consisted of
>90% CD1a+ HLA-DR high CD14 low dendritic cells (data not shown).

These cells were incubated for sixty minutes at 37°C with either SSL7-FITC
or SSL9-FITC and examined by flow cytometry (Figure 5) and confocal microscopy
(data not shown). Dendritic cells stained uniformly strongly positive for both SSL7
and SSL9. Confocal microscopy confirmed that fluorescence was predominantly due
to intracellular uptake of SSL, rather than surface staining. Both SSL7 and SSL9
were concentrated in small vesicular structures, localised particularly to the
perinuclear region of the cell. In order to characterise the nature of these vesicles
further, dendritic cells were cultured in the presence of SSL7-FITC or SSL9-FITC and
Texas Red dextran (data not shown), which is avidly taken up by dendritic cells via
mannose receptors (Sallusto et al., J. Exp. Med. 182, 389-400 (1995)). Texas Red
dextran strongly labelled a large number of intracellular vesicles throughout the
dendritic cell cytoplasm. SSL distribution and dextran distribution partially
overlapped, with some intracellular vesicles clearly containing both markers.
However, SSL positive dextran negative vesicles were also observed. In a small
proportion of cells, vesicles containing SSL9 appeared to aggregate to generate very
large vesicles, which contained high concentrations of both SSL and dextran. The
very large vesicles observed (which were never seen in the presence of dextran
alone) presumably resulted from fusion of many SSL containing vesicles, and may
have been driven by intermolecular interactions between SSL molecules (as observed
during dimerisation in the crystal structure). Similar vesicle distortion is observed in the
presence of excess invariant chain, again driven by multiple interactions between
invariant chain molecules (Romagnoli et al., J. Exp. Med. 177, 583-596 (1993)).

In order to determine whether uptake of SSLs was a generalised feature of
endocytic cells, peripheral blood monocytes were differentiated into macrophages,
via culture in human serum, without added cytokines. Under these culture conditions,
the cells develop a completely different phenotype (CD1a-, HLA-DR-, CD14 high)
and morphology (lack of dendrite formation) (Swetman et al., Eur. J. Immunol. 32,
2074-2083 (2002)). Macrophages, like dendritic cells, efficiently endocytosed Texas
Red dextran, but showed no uptake of either SSL7 or SSL9 (data not shown).

Discussion 

One of the most exciting results from the structural studies of SSL7 was the
identification of an identical SSL7 homodimer in the asymmetric unit of two
otherwise very different crystal forms, grown from very different solution conditions.
The dimer has a number of characteristics seen in functionally relevant dimers,
which indicates that the dimer is not purely an artefact of crystallization.
Interestingly, the structure of SSL5 did not reveal any such dimer formation (Arcus
et al., J. Biol. Chem. 277, 32274-32281 (2002)) and the residues making up the
interface are not conserved across the different SSL proteins.

The change in the orientation of the β-grasp domain in SSL7 relative to SSL5
is necessary to allow the dimer to form: if a similar dimer is created from SSL5
monomers, clashes occur between residues 110-114 in one monomer and 197-200 in
the other, and more seriously between residues 118-125 and 161-165. These clashes
are alleviated by the change in the orientation of the β-grasp domain β-strands in
SSL7, further suggesting that SSL5 does not form a dimer in the same way.






WO 2005/092918 PCT/GB2005/001084 

Since no crystal structure of SSL9 is yet available, a preliminary comparison with
SSL7 was carried out using a homology model. As for SSL5, the residues involved
in the dimer interface in SSL7 are not conserved between SSL7 and SSL9. However,
the sequence changes do not create steric or electrostatic clashes between the two
monomers; rather they are such that hydrogen-bonding between the two molecules is
maintained, and in some cases, new hydrogen-bonds are formed. SSL9 may therefore
form a dimer in the same manner as SSL7.

SSL7 and SSL5 show differences in the regions of the N-terminal domain that
are implicated in a general low-affinity MHCII binding site in superantigens. The
homology model reveals that the residues in these loops are in general highly
conserved between the SSL7 and SSL9 sequences. However, there are some
important differences, including the change of P93 in SSL7 to threonine in SSL9.
This proline is part of a well ordered β-turn in SSL7, while in SSL5 there is no
ordered secondary structural element present here: in fact this loop in SSL5 is rather
disordered. The pattern of sequence conservation in the N-terminal loops indicates
that the structures of SSL7 and SSL9 are more related to one another than to SSL5,
and this may also be reflected in the functional properties of the molecules.
However, the small number of non-conservative sequence changes between SSL7 and
SSL9 are also entirely consistent with differences in their putative receptor binding,
as discussed further below.

Studies of the superantigens have shown that their interactions with MHCII
and T cell receptor molecules are diverse, encompassing a number of different
interaction surfaces and stoichiometries. This includes the formation of functionally
important superantigen dimers for some superantigens, for example the Zn²⁺-dependent
dimers formed by staphylococcal enterotoxin D (Sundström et al., EMBO J.
15, 6832-40 (1996)), which form via the C-terminal β-grasp domain, in a manner
reminiscent of the homodimers seen for SSL7. It also includes the formation of
heterodimers using the same surface of the N-terminal OB-fold but different surfaces
of MHC molecules (as in the complexes of HLA-DR1 with SEB and TSST-1
respectively; Jardetzky et al., Nature 368, 711-8 (1994) and Kim et al., Science 266,
1870-1874 (1994)).
The functional and structural studies described here show that SSL7 and
SSL5 may be functionally active in different quaternary states.
The most significant differences between the structures of SSL7, SSL5 and other
superantigens suggest that these molecules may interact with different binding
partners, and this is supported by the studies of cellular tropism. The characteristic
features of the interaction of SSL7 and SSL9 with PBMC are specificity,
temperature dependence and cell selectivity. Specificity, indicative that the
interaction is mediated by a cell surface receptor, is shown by the demonstration that
unlabelled SSL blocks uptake of SSL-FITC. This competition is observed for both
SSL7 and SSL9, ruling out the hypothesis that the results are due to significant
differences in affinity of binding between the two.

The lack of reciprocal inhibition between SSL7 and SSL9 indicates that these
two molecules have different binding partners on the cell surface, although the
possibility that they bind to different sites on the same molecule cannot be ruled out.
Although the binding sites of SSL7 and SSL9 on the cell surface are distinct, both are
able to self-target to APCs. Since it was impossible to measure binding in the
absence of uptake, true measurements of affinity could not be obtained. The
concentrations required to obtain measurable uptake, however, were in the order of
0.1 micromolar, suggesting that the affinity of interaction with any putative receptor
is relatively low. This is a characteristic of many classical superantigens (Labrecque
et al., Semin. Immunol. 5, 23-32 (1993)).

The temperature and time dependence of SSL interaction are suggestive of
receptor mediated uptake rather than simple binding to the cell surface, and this was
confirmed by the confocal microscopy studies discussed further below. However, a
small amount of surface binding can be detected at 37°C, but not at 4°C, using indirect
labelling of intact cells with an antibody against the histidine tag (not shown). The
interaction of SSL with the receptor, as well as its subsequent uptake, is therefore
temperature-dependent.

The third characteristic of SSLs observed in these cellular studies with PBMC
was the highly selective nature of the target population with which interaction could
be detected. In ex vivo PBMC, the major target population is the monocyte,
characterised by high expression of CD14. Essentially all monocytes were found to
interact with both SSL7 and SSL9. In contrast, neither SSL7 nor SSL9 interacted
with T cells, identified by expression of CD3 and high levels of CD2. Interestingly,
SSL7, but not SSL9, also bound to a proportion of B cells (in the order of 30%,
although this varied significantly between individuals), providing further evidence
that the receptors for these two molecules are distinct. Since a very significant
proportion of T cells, and all human B cells, also express class II MHC (e.g. HLA-DR),
this result rules out a direct binding of SSL7 or SSL9 to these molecules, thus
clearly distinguishing them from classical superantigens.

Monocytes express both class I and class II MHC molecules, and can act as
antigen presenting cells for the activation of CD4 or CD8 T cells. However, the
prototype antigen presenting cell, and the only cell type which can activate naive T
cells, is the dendritic cell. It was therefore of interest that both SSL7 and SSL9 were
taken up efficiently by monocyte-derived dendritic cells and hence both molecules
can self-target to this important class of antigen presenting cells. This cell type,
which can be obtained by culture of PB monocytes in appropriate cytokines, provides
a widely used model for myeloid dendritic cells. In contrast, neither SSL7 nor SSL9
showed any tropism for macrophages, a cell type also produced by in vitro culture of
monocytes, but which has no antigen presenting capabilities.

Studies indicate that antigen presenting cell activity remains intact in the
presence of SSLs. Conversely, self-targeting to antigen presenting cells results in
enhancing the immunogenicity of these proteins. The uptake of SSL7 and SSL9 into
an endosomal compartment which intersects with the dextran uptake pathway
indicates that the SSLs are successfully targeted to the antigen presentation pathway.
This is because uptake via the mannose receptor efficiently targets antigens to the
Class II MHC antigen processing pathway (Sallusto et al., J. Exp. Med. 182, 389-
400 (1995)). Although enhancing immunogenicity would, at first sight, appear to be
paradoxical, the generation of an antibody response to a secreted protein is unlikely
to confer any advantage in bacterial clearance by the host. On the contrary, the
interaction between secreted toxin and specific antibody in the microenvironment of
the bacterium may activate complement and hence contribute to the breakdown of
the physical barriers that restrict the invasiveness of these bacteria.
SSLs therefore appear to provide S. aureus with an alternative molecular
strategy with which to distract the protective adaptive immune response of the host,
and contribute to bacterial pathogenicity. Specifically, SSLs achieve this through
their ability to target antigen presenting cells.

Example 2 

The work described in Example 1 shows that SSLs (staphylococcal superantigen-
like proteins) interact selectively with antigen presenting cells, including dendritic
cells. The functional consequences of this interaction are now examined further. We
show that SSL uptake does not adversely affect any of the parameters of antigen
presenting cell function examined using dendritic cells. SSL7 and SSL9 were found to
have no effect on viability or morphology of dendritic cells. The proteins did not
induce dendritic cell maturation, as measured by cell surface phenotype. Exposure to
SSL did not alter the ability of dendritic cells to take up FITC-dextran. In addition,
exposure to SSLs did not impair the ability of the dendritic cells to stimulate
allogeneic or antigen specific T-cell responses.

The ability of antigen presenting cells to present SSLs was also examined.
Dendritic cells loaded with SSL7 or SSL9 were able to stimulate a T-cell proliferative
response in three out of eight healthy individuals tested. Sera from nine out of ten
individuals tested contained antibodies against both SSL7 and SSL9, and the
response to each SSL was specific and not cross-reactive.

The results obtained demonstrate that SSLs can be used to specifically target
antigen presenting cells and gain access to the antigen presentation pathway of these
cells. SSLs may therefore be utilised to specifically deliver chosen antigens to
antigen presenting cells in order to elicit an immune response against the chosen
antigen.

Methods 

Recombinant protein expression and purification

Recombinant N-terminal histidine tagged SSL7 and SSL9 proteins from S.
aureus strain NCTC6571 and Embp32 from S. epidermidis were produced as
described above in Example 1. SSL proteins without histidine tag behaved
identically to the tagged version in terms of cell binding and uptake (Al Shangiti AM
et al., Infect. Immun., 72:4261-70 (2004)).

Antibodies 

The following monoclonal antibodies (MAbs) were used: CD2 (mouse MAb
MAS 593, IgG2b; Harlan), CD3 (supernatant mouse MAb UCHT1, IgG1; gift from P.
C. L. Beverley [Edward Jenner Institute for Vaccine Research, Compton, UK]),
CD14 (supernatant mouse MAb HB246, IgG2b; gift from P. C. L. Beverley),
CD19 (supernatant mouse MAb BU12, IgG1; gift from D. Hardie [Birmingham
University, Birmingham, UK]), HLA-DR (supernatant mouse MAb L243, IgG2a; gift
from P. C. L. Beverley), HLA-ABC (W6/32; Serotec), CD86 (supernatant mouse
MAb BU63, IgG1; gift from D. Hardie), and CD54 (mouse IgG MEM-111; gift from
Prof. Horejsi, Academy of Science, Prague, Czech Republic). Fluorescein
isothiocyanate (FITC) conjugated anti-mouse rabbit polyclonal antibody was
purchased from Dako. PE-conjugated anti-CD1a (monoclonal mouse IgG1, clone
BL6) was obtained from Immunotech, Marseille, France. The rabbit polyclonal
antibody to His-SSL7 was produced under contract by Eurogentec Ltd (Southampton,
UK) and validated by Western blot and ELISA.

Dendritic Cell Preparation 

Monocyte-derived human dendritic cells (MDDC) were generated from fresh
whole blood samples obtained from healthy volunteers as described previously (Al
Shangiti et al., (2004), supra). Briefly, mononuclear cells separated on
Lymphoprep™ (Nycomed Pharma) by centrifugation at 400g for 30 mins were
incubated in 6-well tissue culture plates at 37°C in 5% CO2 in complete medium
(CM) (RPMI 1640 medium (Gibco) supplemented with 10% fetal calf serum (FCS;
PAA Laboratories), 100 U/ml penicillin, 100 µg/ml streptomycin, 2 mM L-glutamine
(Clare Hall Laboratories, Imperial Cancer Research Fund), 100 ng/ml human
recombinant granulocyte-macrophage colony-stimulating factor (GM-CSF) and 50
ng/ml interleukin (IL)-4 (Schering-Plough Research Institute)). On day four of
incubation, loosely adherent cells were collected, and contaminating T and B
lymphocytes were removed by incubation with CD3, CD2, and CD19 MAbs,
followed by anti-mouse IgG-coated immunomagnetic Dynabeads™ (Dynal). The
non-adherent fraction, containing highly purified dendritic cells (less than 5% CD3,
CD19 or CD14) was cultured for another three days in fresh culture medium with
GM-CSF and IL-4.

Cell viability

10^5 dendritic cells/group were cultured with SSL7 or SSL9 (4.16 µM) in a
96-well plate, and the plate was incubated at 37°C in 5% CO2 overnight. Cell
viability was assessed by trypan blue exclusion assay. 10 µl of cell suspension was
diluted in an equal volume of trypan blue solution. 10 µl of this mix was loaded on
a haemocytometer counting chamber placed under the microscope and white live
cells (dead cells turn blue) were counted within a 4 x 4 square grid.
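
By way of illustration only, a trypan blue count of this kind can be converted into a percentage viability and a live cell concentration as sketched below. The function name and the example counts are hypothetical and are not taken from the study. The sketch assumes that the 4 x 4 grid is one 1 mm x 1 mm large square of a standard 0.1 mm deep haemocytometer (a counted volume of 0.1 µl) and that mixing with an equal volume of trypan blue gives a dilution factor of 2.

```python
def trypan_blue_viability(live_count, dead_count, dilution_factor=2.0):
    """Summarise one haemocytometer count (hypothetical helper).

    live_count / dead_count: unstained (live) and blue (dead) cells counted
    in a single 4 x 4 grid, assumed here to be one 1 mm x 1 mm large square
    of a 0.1 mm deep chamber, i.e. a counted volume of 1e-4 ml.
    dilution_factor: 2.0 for cells mixed with an equal volume of trypan blue.
    """
    total = live_count + dead_count
    if total == 0:
        raise ValueError("no cells were counted")
    percent_viable = 100.0 * live_count / total
    # cells per ml = count / counted volume (1e-4 ml), corrected for dilution
    live_cells_per_ml = live_count * dilution_factor * 1e4
    return percent_viable, live_cells_per_ml


# Made-up counts, for illustration only
viability, concentration = trypan_blue_viability(live_count=96, dead_count=4)
print(f"{viability:.1f}% viable, {concentration:.2e} live cells/ml")
```

If a different chamber or grid is used, the volume factor of 1e4 would need to be adjusted accordingly.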

Cell surface phenotype expression 

Dendritic cell surface staining was performed by using a panel of monoclonal
antibodies (MAbs) directed against surface antigens expressed by dendritic cells and
the appropriate specific isotype controls. Briefly, 10^5 cells were pre-incubated for 24
hours in culture medium with 4.16 µM of SSL7 or SSL9 or with PG (5 µg/ml, from
S. aureus, Sigma) or purified LPS (100 ng/ml; Salmonella minnesota, Sigma) in 96
well U-bottomed plates at 37°C. Cells were resuspended in 100 µl of staining buffer
(HBSS, 1% FBS, 0.1% sodium azide), and incubated first with the relevant MAbs for
30 mins at 4°C. Cells were washed, and secondary immunolabeling was performed
using FITC-conjugated rabbit anti-mouse immunoglobulin (30 min, 4°C). Cells were
washed three times and fixed in 3.8% paraformaldehyde and examined within 5 days
on a FACScan flow cytometer (Becton Dickinson). Data were analysed using
CellQuest software.
Endocytosis assay

Dendritic cells (10^5) were incubated in culture medium with or without 4.16 µM
of SSL7 or SSL9 for various times (1 or 18 hours) in 96 well U-bottomed plates at
37°C. Different concentrations (1, 3, 10 and 30 µg/ml) of FITC-dextran (40,000
MW) were incubated with the cells. After 1 hour of incubation at 37°C, cells were
washed in ice cold HBSS containing 0.1% azide to stop further endocytosis, fixed
with 3.7% formaldehyde, and analysed by flow cytometry. The uptake of dextran is
expressed as mean fluorescent intensity. For each sample at least 5000 events gated
on dendritic cells were analysed.
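
As a hedged illustration of how the read-out might be computed, the sketch below takes the per-event fluorescence values of the gated dendritic cells, enforces the minimum of 5000 events, and returns the mean fluorescent intensity. The function names, the ratio of treated to untreated cells and the example values are assumptions introduced for illustration; they do not describe the CellQuest software or the actual data.

```python
from statistics import fmean

MIN_GATED_EVENTS = 5000  # minimum number of events gated on dendritic cells


def mean_fluorescence_intensity(gated_intensities):
    """Mean fluorescent intensity of one sample (hypothetical helper)."""
    if len(gated_intensities) < MIN_GATED_EVENTS:
        raise ValueError("fewer than 5000 gated events in this sample")
    return fmean(gated_intensities)


def relative_uptake(treated_mfi, untreated_mfi):
    """MFI of SSL-treated cells as a fraction of untreated cells.

    This ratio is an illustrative summary, not a measure used in the study.
    """
    return treated_mfi / untreated_mfi


# Made-up per-event intensities, for illustration only
untreated_events = [200.0] * 5000
treated_events = [190.0] * 5000
ratio = relative_uptake(
    mean_fluorescence_intensity(treated_events),
    mean_fluorescence_intensity(untreated_events),
)
print(f"relative FITC-dextran uptake: {ratio:.2f}")
```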

T-cell Proliferation Assays

Autologous T-cells were obtained from the non-adherent population of the peripheral
blood mononuclear cell fraction from eight healthy volunteers (age range 20-50,
median approximately 30) and cryopreserved in FCS containing 10% DMSO
(Sigma Aldrich) at -70°C. Cells were thawed rapidly (37°C), and B cells, monocytes,
and macrophages were depleted by incubation with CD19, HLA-DR and CD14
MAbs for 45 minutes on ice. Cells were washed and then mixed with magnetic
microbeads and separated on magnetic columns. T-cells (greater than 90% purity)
were used immediately after purification. Allogeneic T-cells used in mixed
leucocyte reactions (MLR) were prepared from HLA-mismatched donors by the same
procedure. Purified dendritic cells (10^4), either untreated or treated for 18 hours with
different concentrations of SSL proteins (4.16, 1.25 and 0.42 µM), were incubated at
37°C/5% CO2 with autologous T-cells (2 x 10^5 cells/well) in the presence of purified
protein derivative (PPD) or with allogeneic T-cells in flat-bottomed 96-well
microtiter plates. The dendritic cell autologous and allogeneic T-cell cocultures were
incubated for 6 days. Both assays were then pulsed with 1 µCi of [3H]thymidine (ICN
Biomedical, High Wycombe, United Kingdom) for the final 18 hours of culture.
Cells were harvested, and T-cell proliferation was measured by liquid scintillation
counting (Microbeta Systems). All assays were performed in triplicate. Results
were expressed as cpm. Error bars represent the standard deviation (SD).
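
The summary of triplicate wells as mean cpm with SD error bars can be sketched as follows. This is a minimal illustration only: the helper name and the counts are hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator), which the text does not specify.

```python
from statistics import mean, stdev


def summarise_triplicate(cpm_values):
    """Return (mean cpm, sample SD) for replicate wells (hypothetical helper)."""
    if len(cpm_values) < 2:
        raise ValueError("at least two replicate wells are needed for an SD")
    return mean(cpm_values), stdev(cpm_values)


# Made-up triplicate counts, for illustration only
wells = [36000, 37000, 38000]
average_cpm, sd_cpm = summarise_triplicate(wells)
print(f"{average_cpm:.0f} ± {sd_cpm:.0f} cpm")
```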
Cytokine assays

Autologous T and dendritic cells were incubated with different concentrations
of SSL proteins (4.16, 1.25 and 0.42 µM) at 37°C/5% CO2 in 24-well plates. After 4
days, cell culture supernatants were centrifuged for the removal of cells and stored at
-70°C. Cytokine detection was done by enzyme-linked immunosorbent assay
(ELISA) for interleukin-10 (IL-10, Pharmingen, UK), gamma interferon (IFN-γ,
Pharmingen, UK) and IL-13 (ImmunoTools, Germany). Purified protein derivative
(PPD; 500 U/ml) was used as positive control.

Antibody detection

Human serum was collected from heparinised blood of ten normal individuals
(three females, seven males, age range 20-50, median 30). ELISA microassay
plates were coated for 24 hours with 0.04 µM of SSL proteins dissolved in 0.1 M
sodium carbonate buffer, pH 9.5, at 4°C (100 µl/well). After three successive washes
with HBSS containing 0.1% Tween 20™, blocking was performed with 1%
skimmed milk in HBSS for 1 hour at 37°C. The plates were washed again three
times, as before, and the tested sera, diluted 1:2000 in HBSS with 0.1% Tween 20™,
were added to the SSL coated plates for 1 hour at 37°C (preliminary studies showed
that this dilution of antisera gave no background staining for any serum tested in
control wells). After three additional washes, the remaining bound antibodies were
incubated for 1 hour at 37°C with alkaline phosphatase-conjugated anti-human
antibodies diluted 1:1000 in HBSS with 0.1% Tween 20™. Excess conjugate was removed
by washing as above and a colorimetric reaction was carried out by addition of the
chromogen OPD (o-phenylenediamine dihydrochloride) for 15-20 minutes. The
plates were read at 405 nm to obtain optical density (OD) readings. A control well
containing no serum was used to detect the background count. A polyclonal rabbit
antibody raised against purified His-tagged SSL was included as a positive control
(anti-SSL7).

Competitive ELISA was performed by mixing the sera with differing
concentrations of SSL or a control bacterial protein Embp32 (0.08, 0.17, 0.33 and
0.42 µM), and then testing binding on SSL-coated plates as above.
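
One common way of summarising a competitive ELISA of this kind is as percentage inhibition of the background-corrected signal. The sketch below is offered under that assumption only; the formula, the function name and the OD readings are illustrative and are not stated in, or taken from, the study.

```python
def percent_inhibition(od_with_competitor, od_without_competitor, od_background=0.0):
    """Percentage inhibition of serum binding by a soluble competitor.

    All readings are optical densities; od_background is the no-serum
    control well. Hypothetical helper, not the authors' stated calculation.
    """
    uninhibited = od_without_competitor - od_background
    if uninhibited <= 0:
        raise ValueError("uninhibited signal must exceed background")
    remaining = od_with_competitor - od_background
    return 100.0 * (1.0 - remaining / uninhibited)


# Competitor concentrations from the method (µM) with made-up OD readings
competitor_um = [0.08, 0.17, 0.33, 0.42]
od_readings = [0.80, 0.55, 0.30, 0.10]
inhibition_curve = [
    (conc, percent_inhibition(od, od_without_competitor=1.10, od_background=0.10))
    for conc, od in zip(competitor_um, od_readings)
]
for conc, inhibition in inhibition_curve:
    print(f"{conc:.2f} µM competitor: {inhibition:.0f}% inhibition")
```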
Statistical analysis

The means of paired groups were analysed by a 2-tailed Student's t test. The
level of significance was 0.05.
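
A paired, two-tailed Student's t test at the 0.05 level can be sketched as below. The helper name and the example counts are hypothetical; the critical values are the standard two-tailed 0.05 table values for 1 to 10 degrees of freedom, and the sketch reports only whether the statistic exceeds the critical value rather than an exact P value.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed critical values of Student's t at the 0.05 level,
# keyed by degrees of freedom (standard table values).
T_CRITICAL_0_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
                   6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def paired_t_test(group_a, group_b):
    """Return (t statistic, degrees of freedom, significant at 0.05)."""
    if len(group_a) != len(group_b):
        raise ValueError("paired groups must have the same length")
    differences = [a - b for a, b in zip(group_a, group_b)]
    n = len(differences)
    df = n - 1
    if df not in T_CRITICAL_0_05:
        raise ValueError("this sketch only tabulates 1 to 10 degrees of freedom")
    sd = stdev(differences)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t_stat = mean(differences) / (sd / sqrt(n))
    return t_stat, df, abs(t_stat) > T_CRITICAL_0_05[df]


# Made-up paired cpm values, for illustration only
control = [25000, 26000, 27000]
treated = [24000, 24000, 24000]
t_stat, df, significant = paired_t_test(control, treated)
print(f"t = {t_stat:.3f}, df = {df}, significant at 0.05: {significant}")
```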



Results

The effects of SSL protein on DC viability, morphology and surface phenotype

Monocyte-derived dendritic cells (MDDC) were derived from peripheral
blood monocytes. They showed the characteristic phenotype with low CD14, and
high HLA-DR and high CD1a. As shown in Figure 1, fluoresceinated SSL7 and
SSL9 labelled CD1a-expressing dendritic cells in unpurified dendritic cell
cultures, while residual contaminating cells (predominantly lymphocytes) did not
show any interaction with either protein. Nevertheless, in order to exclude the
possibility of indirect effects mediated on dendritic cells via some other cell type, all
further experiments were performed on purified dendritic cell cultures, containing
less than 5% non-dendritic cells.

Dendritic cells were incubated with SSL7 or SSL9 (4.16 µM) for either 1 or
18 hours at 37°C and cell viability was assessed by trypan blue exclusion.
Microscopic analysis revealed that SSL proteins were not cytotoxic as more than
95% of cells appeared viable. Untreated cells were predominantly non-adherent with
few dendritic cell processes, typical of immature dendritic cells (Figure 7A, top
panel). Neither SSL7 nor SSL9 induced any noticeable morphological changes over
the time period tested (Figure 7A, middle panels). In contrast, dendritic cells treated
with the TLR4 bacterial ligand LPS (100 ng/ml) or peptidoglycan (PG; 5 µg/ml)
became adherent and extended multiple, long dendritic processes (Figure 7A, bottom
panels). The cell-surface expression of a panel of characteristic dendritic cell surface
markers was analysed by flow cytometry. Immature dendritic cells were incubated
with 4 µM SSL7 or SSL9, or with LPS or PG, and the surface phenotype of these
dendritic cells was analysed after 18 hours of culture. Neither SSL7 nor SSL9 (Figure
7B) induced significant changes in any of the surface molecules measured. In
contrast, dendritic cells incubated with either LPS or PG up-regulated surface
expression of HLA-DR, HLA-ABC, CD86 and CD54. Thus, in summary, exposure
of dendritic cells to the SSL proteins did not induce dendritic cell maturation, nor
indeed any obvious changes in dendritic cell surface phenotype, viability or
morphology.

The influence of SSLs on endocytosis

Fluorescein isothiocyanate-labelled dextran (FITC-Dx) is rapidly taken up by
dendritic cells via the mannose receptor (Sallusto F., et al., J. Exp. Med., 182:389-
400 (1995)). To determine whether SSL protein altered antigen uptake function,
dendritic cells were treated with 4.16 µM of SSL7 or SSL9 and incubated for 1 or 18
hours at 37°C. Different concentrations of FITC-Dx (1, 3, 10 and 30 µg/ml) were
added to the cells and incubated for a further hour at 37°C, and FITC-Dx uptake by
the dendritic cells was then measured by flow cytometry. Figure 8 shows the results
obtained, with the total cell associated dextran being measured by flow cytometry and
expressed as mean fluorescent intensity for a minimum of 5000 dendritic cells. As
shown in Figure 8, SSL treated dendritic cells showed rapid uptake of FITC-Dx, and
neither protein had any effect on endocytic activity.

The influence of SSLs on the T-cell stimulatory capacity of dendritic cells

To evaluate the effect of SSL proteins on the stimulatory capacity of dendritic
cells in T-cell proliferation, day 6 dendritic cells (10^4) were incubated for 18 hours
with different concentrations of SSL proteins (0.42, 1.25, and 4.16 µM). Residual T-
cells were depleted and the functional assays performed using fresh viable purified
autologous or allogeneic T-cells (2 x 10^5). The ability to induce secondary immune
responses was unchanged. Figure 9A shows representative experiments eliciting
recall responses to tuberculin (PPD; 500 U/ml). There was no statistical difference
in the proliferative responses observed between any of the dendritic cell groups
pre-incubated with SSL7 or SSL9 (37182 ± 2036 cpm and 36458 ± 3000 cpm,
respectively) and the control (36458 ± 6151 cpm) after the 6 day co-culture period
(P>0.05).

The capacity of SSL-treated dendritic cells (10^4) to elicit primary T-cell
proliferation was also tested in an allogeneic mixed lymphocyte reaction (MLR). The
same concentrations of SSL7 and SSL9 and number of T cells used in the autologous
assay were employed with the allogeneic T cells. A similar result was observed
(Figure 9B), as the proliferation response of allogeneic T-cells (25710 ± 1140 cpm)
was unaffected by treatment with SSL7 or SSL9 (26149 ± 3674 and 25816 ± 3159
cpm, respectively (P>0.05)). Therefore, SSL proteins have no effect on the ability to
induce proliferation of allogeneic T-cells. In general, the results of these
experiments demonstrate that the antigen presentation capacity of dendritic cells
remains intact in the presence of these secreted proteins.

T-cell responses to SSLs in the normal human population

To investigate the ability of SSLs to stimulate a recall T-cell response in
healthy volunteers, dendritic cells (10^4) were incubated with autologous T-cells (2 x
10^5) from normal donors in the presence of different concentrations of SSL proteins
(0.42, 1.25, and 4.16 µM) for 6 days (Figure 10). Purified protein derivative (PPD)
was used as a positive control. A recall response against SSL7 was documented in
2/8 individuals and a response to SSL9 in 3/8 individuals. All volunteers showed a
good recall response against PPD (70889 ± 3146 cpm).

The supernatants of three dendritic cell/T-cell/SSL co-cultures (individuals 1,
2 and 8 from Figure 10) were tested for IFN-γ (TH1), IL-13 (TH2) and IL-10 (Treg)
after 4 days. All cytokine levels were low (IFN-γ < 700 pg/ml, n=3; IL-13 < 50
pg/ml, n=3) or undetectable (IL-10). The results for the individual with maximum
response (individual 1 in Figure 10) at different SSL concentrations are shown in
Figure 11.

Antibody responses to SSLs in the normal human population

In order to see if the presence of a T-cell response correlated with antibody
production, sera from ten individuals (including those tested for T-cell responses as
shown above) were tested by ELISA against immobilized SSL7 and SSL9 (Figure
12A). Sera were diluted 1:2000 and tested for binding to SSL7 or SSL9 by ELISA as
described in the Methods section. Nine out of ten individuals tested
showed antibody responses to both SSL7 and SSL9 at this dilution. Interestingly,
competitive ELISA (Figure 12B and C) showed that the antibody response was
highly specific for individual SSL isotypes. Increasing concentrations of SSL7
(Figure 12B) were able to completely block the interaction between SSL7 protein
and serum antibodies to SSL7. In contrast, neither SSL9 nor an unrelated bacterial
protein (Embp32) had any effect on SSL7 antibody binding. Conversely, only SSL9, but
not SSL7 or Embp32, was able to block SSL9 antibody binding (Figure 12C). Therefore,
the interactions between SSL proteins and SSL antibodies were specific and did not
cross-react.

Discussion 

Dendritic cells as professional antigen presenting cells have a key role in the
initiation of the immune response against microbial infections; therefore, many
microbial strategies have been described which interfere with dendritic cell function
(Moll H., Cell Microbiol., 5:493-500 (2003)). One possibility was that SSLs might
interfere with normal function of dendritic cells and therefore impair the protective
immune response to S. aureus. Such functions have recently been proposed for the
anthrax lethal toxin (Agrawal A, et al., Nature, 424:329-34 (2003)) and E. coli heat
labile toxin (Petrovska L, et al., Vaccine, 21:1445-54 (2003)). The possibility of
SSLs inhibiting dendritic cell function was therefore ruled out in the present study.

SSL7 and SSL9 were shown to be non-toxic to antigen presenting cells and did
not alter the characteristic morphology of these cells (cf. the effect of Clostridium
difficile toxin B (Swetman CA et al., Eur. J. Immunol., 32:2074-83 (2002))).
Conversely, SSL7 and SSL9 did not induce process extension, or up-regulation of
cell surface co-stimulatory and HLA molecules on the dendritic cell, two
characteristic signs of activation/maturation responses induced by whole S. aureus
(Tourkova IL, et al., Immunol. Lett., 78:75-82 (2001)) or bacterial surface
components such as peptidoglycan (PG) (Michelsen KS, et al., J. Biol. Chem.,
276:25680-6 (2001)). Thus, although SSLs bind to and are taken up by dendritic
cells (Figure 6), this interaction does not appear to engage activating receptors on the
dendritic cell surface. Previous studies (Williams RJ, et al., Infect. Immun.,
68:4407-15 (2000)) indicating that SSLs could activate high levels of inflammatory
cytokine release from peripheral blood cells could not be repeated using the highly
purified protein preparations used for this study (data not shown) and may have
resulted from trace amounts of contaminating LPS.

In addition to their specialized dendritic morphology and cell surface
phenotype, dendritic cells are characterized by extremely rapid endocytosis by both
fluid phase and receptor mediated uptake (Swanson JA, et al., Trends Cell Biol.,
5:424-8 (1995) and Levine TP, et al., Adv. Exp. Med. Biol. 329:11-5 (1993)). The
uptake of FITC-Dextran, which is believed to be mediated via mannose receptors on
the cell surface (Sallusto F, et al., J. Exp. Med., 182:389-400 (1995)), is frequently
used to measure the latter. Dendritic cells did indeed show efficient internalization of
FITC-Dextran (albeit slightly less well after overnight culture) and this uptake was
not altered by exposure to SSL7 or SSL9. Finally, since dendritic cells are
distinguished by being the most potent stimulators of both primary and secondary T-
cell responses, we tested the effects of SSL7 and SSL9 exposure on dendritic cell
function directly. Although dendritic cells stimulated powerful proliferative
responses to both PPD (a classical recall secondary response to BCG vaccination)
and allogeneic purified T-cells (predominantly a primary response), neither SSL7 nor
SSL9 altered the antigen presentation activity of dendritic cells. Taken together,
therefore, these data do not provide any evidence that SSL proteins inhibit or modify
dendritic cell function.

Further experiments demonstrated that SSLs targeted to antigen presenting
cells, in particular dendritic cells, are actually delivered to the antigen presentation
pathway. This can hence enhance an immune response to these proteins and can also
be used to deliver chosen antigens to the same pathway. The immune response to
SSLs was analysed in a small panel of healthy human volunteers. Although none of
the individuals tested had any known history of clinical S. aureus infection, the
organism is extremely prevalent in the environment and approximately 30-40% of
individuals are persistently colonized by S. aureus, usually in the nasal mucosa (Nair
SP, Williams RJ, Henderson B, Advances in our understanding of the bone and joint
pathology caused by Staphylococcus aureus infection, Rheumatology (Oxford),
2000; 39:821-34). Indeed, in eight volunteers tested, three (37%) showed a dose
dependent T-cell response to dendritic cells loaded with SSL9 and two to SSL7. The
response was detectable with relatively large numbers of T-cells/well (2 x 10^5) and
induced the release of a very low level of either TH1 (IFN-γ) or TH2 (IL-13)
cytokines, suggesting that the precursor frequency of T-cells specific for SSL was
likely to be low.

Despite a low T-cell precursor frequency, the humoral response to SSL7 and
SSL9 was robust. Using solid phase ELISA, nine out of ten sera tested showed
specific antibody binding to both SSLs. The response measured was IgG (using an
anti-IgG detection antibody), suggesting that class switching had occurred and further
implicating the activity of SSL-specific T-cells. Interestingly, some individuals (e.g.
individuals 2 and 7 in Figures 10 and 11) show antibody responses, but no detectable
T-cell responses, perhaps because precursor T-cell frequency has fallen below
detectable levels in these individuals. One individual showed a T-cell response to
SSL9, but no antibody response to either SSL tested, although we cannot rule out
that some antibody might be detectable at a lower dilution. Interestingly, the
antibody response to each SSL was highly specific with minimal evidence of cross-
SSL reactivity. These data are consistent with the sequence diversity between SSL
paralogs, despite a highly conserved three-dimensional structure (Al Shangiti et al.,
(2004), supra and Arcus VL, et al., J. Biol. Chem., 277:32274-81 (2002)). The
presence of SSL specific immunity in so many individuals, and the existence of so
many SSL paralogs in the S. aureus genome, is suggestive of a strong evolutionary
interaction between host immunity and this bacterial family of proteins.

In conclusion, this study describes a number of functional consequences of
the interaction between SSLs and dendritic cells. In contrast to some other bacterial
exotoxins (Agrawal et al., (2003), supra and Petrovska et al., (2003), supra), SSLs do
not appear to damage dendritic cells, but rather can be taken up by them, and thus
stimulate T-cell responses in healthy individuals. This means that SSLs may be
employed to selectively deliver chosen antigens to antigen presenting cells, in
particular to dendritic cells. SSLs may therefore be used to help induce an immune
response or tolerance to a selected antigen.
CLAIMS

1. Use of a complex comprising:

(a) a targeting polypeptide comprising a staphylococcal superantigen-like
protein (SSL), a fragment thereof or a variant of either, where the SSL,
fragment or variant has the ability to target the complex to an antigen
presenting cell; and
(b) an antigen and/or a nucleic acid molecule encoding an antigen,
in the manufacture of a medicament for use in immunization or the induction of
tolerance.

2. Use according to claim 1, wherein the antigen comprises a polypeptide which
is present in the complex as a fusion polypeptide with the targeting polypeptide.

3. Use according to claim 1 or 2, wherein the antigen and targeting polypeptides
are not part of the same polypeptide, but are covalently joined to each other or are
joined through a linker.

4. Use according to any one of the preceding claims, wherein the antigen is a 
pathogenic antigen, an auto-antigen, an allergen and/or a cancer antigen. 

5. Use according to any one of the preceding claims, wherein the targeting 
polypeptide is present as a dimer. 

6. Use according to any one of the preceding claims, wherein the targeting 
polypeptide comprises: 

(a) a polypeptide having the amino acid sequence of any of SEQ ID Nos 6,
7, 9, 20, 21, 23, 30, 32, 34, 40, 44, 42, 58, 59, 60, 72, 74, and/or 76;

(b) a fragment of any of the sequences of (a), the fragment having the ability
to target the complex to an antigen presenting cell; and/or

(c) a variant polypeptide having at least 30% amino acid sequence identity
to any of the polypeptides of (a) or (b) and the ability to target the
complex to an antigen presenting cell.

7. Use according to claim 6, wherein the targeting polypeptide comprises:

(a) the sequence of SEQ ID No: 7, 9, 21, 23, 32, 34, 40, 42, 59, 60, 74, 76 and/or
92;

(b) a fragment of any of the sequences of (a), the fragment having the ability
to target the complex to an antigen presenting cell; and/or

(c) a variant polypeptide having at least 70% amino acid sequence identity
to any of the polypeptides of (a) or (b) and the ability to target the
complex to an antigen presenting cell.

8. Use according to any one of the preceding claims wherein the medicament is
for the induction of tolerance and is to be administered without an adjuvant.

9. A complex comprising:

(i) a targeting polypeptide as defined in any one of the preceding claims; and

(ii) an antigen or a nucleic acid encoding an antigen, wherein the antigen or
encoded antigen is selected from a pathogenic antigen, an auto-antigen, an allergen and
a cancer antigen.

10. A complex according to claim 9, wherein the targeting polypeptide is present 
as a dimer. 

11. A virus comprising a targeting polypeptide as defined in any one of claims 1
to 8.

12. A nucleic acid molecule comprising a polynucleotide sequence encoding a
targeting polypeptide and antigen, wherein the targeting polypeptide and antigen are
as defined in claim 9.

13. A nucleic acid according to claim 12, wherein the nucleotide sequence
encoding the targeting polypeptide and the antigen are present in a single open
reading frame.

14. A vector comprising a nucleic acid according to claim 12 or 13.

15. A vector according to claim 14, comprising a promoter capable of giving
rise to expression of both the targeting polypeptide and the antigen in an antigen
presenting cell.

16. A cell comprising a nucleic acid according to claim 12 or 13 or a vector
according to claim 14 or 15 or infected with a virus according to claim 11.

17. A method of loading antigen presenting cells comprising contacting
an antigen presenting cell or a precursor thereof with a complex as defined in any
one of claims 1 to 10 or a virus according to claim 11.

18. A method according to claim 17, which is an in vitro method.

19. An antigen presenting cell which has been loaded with a complex as defined
in any one of claims 1 to 10 or a virus according to claim 11.

20. A pharmaceutical composition comprising a complex as defined in any one of 
claims 1 to 10, a nucleic acid encoding the targeting polypeptide and antigen of a 
25 complex as defined in any one of claims 1 to 10, a vector comprising such a nucleic 
acid, a cell comprising such a nucleic acid or vector, a virus according to claim 1 1 or 
an antigen presenting cell according to claim 19 and a pharmaceutically acceptable 
carrier or diluent. 

21. A vaccine comprising a complex as defined in any one of 
claims 1 to 10, a nucleic acid encoding the targeting polypeptide and antigen of a 
complex as defined in any one of claims 1 to 10, a vector comprising such a nucleic 
acid, a cell comprising such a nucleic acid or vector, a virus according to claim 11 or 
an antigen presenting cell according to claim 19. 

22. A complex as defined in any one of claims 1 to 10, a nucleic acid encoding 
the targeting polypeptide and antigen of a complex as defined in any one of claims 1 
to 10, a vector comprising such a nucleic acid, a cell comprising such a nucleic acid 
or vector, a virus according to claim 11, or an antigen presenting cell according to 
claim 19 for use in a method of treatment of the human or animal body by therapy. 

23. Use of a nucleic acid encoding the targeting polypeptide and antigen of a 
complex as defined in any one of claims 1 to 10, a vector comprising such a nucleic 
acid, a cell comprising such a nucleic acid or vector, a virus according to claim 11 or 
an antigen presenting cell according to claim 19 in the manufacture of a medicament 
for use in immunisation. 

24. A method of immunising a subject, the method comprising administering an 
effective amount of a complex as defined in any one of claims 1 to 10, a nucleic acid 
encoding the targeting polypeptide and antigen of a complex as defined in any one of 
claims 1 to 10, a vector comprising such a nucleic acid, a cell comprising such a 
nucleic acid or vector, a virus according to claim 11 or an antigen presenting cell 
according to claim 18 to a subject. 

25. An agent for immunising a subject, the agent comprising a complex as 
defined in any one of claims 1 to 9, a nucleic acid encoding the targeting polypeptide 
and antigen of a complex as defined in any one of claims 1 to 9, a vector comprising 
such a nucleic acid, a cell comprising such a nucleic acid or vector, or an antigen 
presenting cell according to claim 17. 
SEQUENCE LISTING 



S. aureus strain N315 taken from GenBank 



SEQ ID No: 1 






143101 
143161 
143221 
143281 
143341 
143401 
143461 
143521 
143581 
143641 
143701 
143761 
143821 
143881 
143941 
144001 
144061 
144121 
144181 
144241 
144301 
144361 
144421 
144481 
144541 
144601 
144661 
144721 
144781 
144841 
144901 
144961 
145021 
145081 
145141 
145201 
145261 
145321 
145381 
145441 
145501 
145561 
145621 
145681 
145741 
145801 
145861 
145921 
145981 
146041 
146101 
146161 
146221 
146281 
146341 



tatgaaattt 
aattacatcg 
agagttgaaa 
atatactgaa 
ttctttagtt 
gtttatcctt 
aaaaacaaac 
aggtaaagaa 
agaacttgat 
tcttaaacaa 
aagtcaaaaa 
aattctagta 
gttatggaaa 
aaaagttaca 
aatcatattt 
aatatgggtg 
tacaactatg 
aggtgtaaac 
tgaaaatagc 
gacaggttat 
ttctgttaat 
tcgtaatctt 
tattgggggt 
tattaaagca 
agaagcgatg 
tggtctttac 
gtatacattt 
cacagatatt 
gaaataattt 
tttaatgctt 
tcattatttt 
gtgattatct 
aatcgaatat 
agtttagcac 
gcagaaaaaa 
gcaatgataa 
caagaacgca 
tccaaaatag 
gcaacgccag 
actaaagtga 
gacacaccac 
gatttaagag 
ctcaaaccat 
atagctttag 
gtatttatcg 
acgaagacta 
aatcaaggta 
ttgaaagagc 
aacatgggtt 
ttacacaaaa 
aaaattgaag 
aacaagaagt 
atgtaaaaga 
gaatcctcaa 
atggatttct 



aaagcgatag 
aatgtacaat 
cactattata 
aaaggtaaag 
ggatctgata 
agagaaggtg 
agtcaacctt 
gaaccacaaa 
tatagattaa 
ggtcaaatta 
cttgaaaaag 
gaaatgaaat 
tatatggaag 
aaactcacac 
tatgaatagt 
tatggttcaa 
aaaatgaaaa 
actacaacgg 
aaaaaattaa 
atcagtttca 
aaccttgctt 
aatatatttt 
atcactagtg 
gatcatattg 
tcattgaaag 
ggtgaaatga 
gaattggata 
gatagaattg 
gaaattgaaa 
aaaaatcatt 
ttgcttaaat 
tagaacgcca 
aatatagatt 
tagggctttt 
tacaatcaac 
acataacagc 
cgcctaaact 
aaaaaatatc 
cgcctaaaca 
caacacctcc 
aatctccaac 
cgtattacac 
ggacgacggt 
ttggaaaaga 
ttttagaaga 
atagtaaaaa 
tgatttcacg 
ttgattttaa 
caggaacaat 
aactgcaaga 
tgaatataaa 
taagtgacaa 
cgaatattca 
atgtgccaag 
taatttactt 



caaaagcaag 
cagtacaagc 
ataaaccggt 
attatataga 
aagacaaatt 
acagtagaca 
ttattgacta 
gtagtttata 
gagaacgtgc 
caattacaat 
aacgtatggg 
aatactttct 
ttaagcgacg 
aaacagtcgc 
taaaaacagg 
attacgtaat 
atattgcaaa 
aaaaaccagt 
aagcttatta 
ttcaaccaag 
taattggcaa 
acgttaatga 
caaacgataa 
gtgaatatga 
agattgattt 
gtacagggaa 
aaaagttaca 
aaatcaaagt 
tagagaggtt 
tcaaaggcac 
tacttaataa 
tctataatga 
ggagtataca 
aacaacaggc 
taaagttgac 
aggtgcaaat 
cgaaaaggca 
acaacctaaa 
agaacaatca 
atcaacaaac 
cataaaacaa 
gaaaccgagt 
taggtttatg 
tgagaaaaaa 
caataaatat 
agttgatcac 
cgatgtttca 
attgagaaaa 
cgttattaaa 
gcatcgtatg 
ataatcatga 
cggtttacat 
tttgtttgta 
tgttgaatca 
aacgatgatt 



tttagcattg 
gaaaacagaa 
tttagagcgt 
tgttatagta 
taaagatgga 
agcaacaaat 
tatacacaca 
ccaaatttat 
aatcaaacaa 
gaaagatggc 
tgattctatc 
aacaacaaag 
tactgttgct 
accacgcatt 
ttaatgtgaa 
aaaacaatct 
aataagtttg 
tcatgccgaa 
tactcaacct 
tattaaattt 
agataagcaa 
ggataagaga 
agctgtcgac 
ttatgacttt 
taaattaaga 
aattaccgtc 
agaagaccgt 
tagaaaagct 
aagtgacgat 
atagaaacgc 
tacttcaata 
tgttgtatga 
attatgaata 
gcaattacag 
aaagtaccaa 
tcagcgacaa 
ccaaatacta 
caagaagagc 
caaacgacaa 
acgccacaac 
gcacaaacag 
tttgaatttg 
aatgttattc 
tataaagatg 
caattaaaaa 
aaagcagaat 
gaatacatga 
caacttattg 
atgaaaaacg 
gcagatgtca 
cgttctctaa 
gttgcttagc 
aaagtggcat 
catcaaaatc 
caaatatagt 



ggaatgttag 
gttaaacaac 
aaaaatgtta 
gacaatcaat 
gacaactcga 
tactcaattg 
ccaatccttg 
aaagaagaca 
cacggcttgt 
aaatcacata 
gacggcagac 
cgctatgttg 
tagcttcttt 
atcttttgct 
tatccgaata 
aattataata 
ttattaggaa 
aagaaaccta 
agtattgaat 
atgaatatca 
cattatcata 
tttgaaggtg 
ctaatagcag 
ttcccattta 
aaatacctta 
aaaaagaaat 
atgtccgatg 
taatacacat 
caaacgttgc 
tatattaacc 
attgttaaaa 
ttcaaattac 
tgaaaacaat 
taacgacgca 
cgcttaaagc 
cacaagcagc 
atgaggaaaa 
agaaatcgct 
ccgaatccac 
caatgcaatc 
atatgactcc 
aaaagcagtt 
caaataggtt 
gaccttacga 
aatattctgt 
taagcgttac 
ttactaagga 
aaaaacataa 
gtgggaagta 
tagaaggtac 
atagaagctg 
ttcttttatt 
ttctatgtct 
agttttattt 
taaacaaggt 



caacaggtgt 
aaagtgaatc 
ctggatataa 
attctcaaat 
atatagatgt 
gtggcgtaac 
aaatcaagaa 
tctcattgaa 
attcaaatgg 
ctatcgattt 
aaatacaaaa 
aatagtgctt 
ttttgagggg 
taaatagctt 
cagctcctat 
gattggagca 
tattagcaac 
ttgtaataag 
ataaaaatgt 
tagatggtaa 
cgggtgtaca 
caaagtactc 
aagcaagagt 
aaatagataa 
ttgataatta 
actatggaaa 
ttatcaatgt 
acttgacgac 
ttaacttctt 
tcataatcac 
agggtttaat 
gtaaaaagac 
tgctaaaacc 
atcggtcaaa 
agagcgatta 
taacacaaga 
aacctcagct 
taatatatca 
aacgccgaaa 
tactaaatca 
taaatatgaa 
tggatttttg 
catctataaa 
taatatcgat 
cggtggcatc 
taaaaaagat 
agagatttcc 
tctttacggt 
tacgtttgaa 
aaacattgat 
acatcggtaa 
atgcgtaatg 
taaaagtgac 
aacgaacatt 
ttaatgtgaa 



146401 

146461 

146521 

146581 

146641 

146701 

146761 

146821 

146881 

146941 

147001 

147061 

147121 

147181 

147241 

147301 

147361 

147421 

147481 

147541 

147601 

147661 

147721 

147781 

147841 

147901 

147961 

148021 

148081 

148141 

148201 

148261 

148321 

148381 

148441 

148501 

148561 

148621 

148681 

148741 

148801 

148861 

148921 

148981 

149041 

149101 

149161 

149221 

149281 

149341 

149401 

149461 

149521 

149581 

149641 

149701 

149761 

149821 

149881 

149941 

150001 

150061 

150121 

150181 

150241 



tggagcaata 

taataattac 

gcactaggcc 

acaccatctt 

gcaccgcaat 

caaacaccga 

acaaaacaag 

aaaccaagtt 

agatttatga 

gataaaaaat 

aaaaataaat 

aaagttgatc 

catgatgttt 

aaattgagaa 

attgttatta 

gaaaatcgca 

aaataatcat 

aacggcctac 

tcgatgggtc 

gggtgtgaag 

ttaataatga 

ataataaagc 

catacaacta 

acaggaacaa 

gaaaatgtga 

cttaaaaatg 

aaaaatagaa 

cgcaaaaatc 

gtgttttcat 

gcaccaagat 

cactacattt 

ttaattcaaa 

atgaaagatg 

agtgatgtca 

gaaatatgga 

agcttctttt 

aataaaactt 

aacaatgtgg 

gtgttataaa 

aaatagttaa 

tgttcgtatt 

agattgggag 

cttattaact 

agagagagta 

ttttgaattc 

ctttaaccaa 

taaaaaaggc 

tagactatct 

acatttattt 

aattaataaa 

tgaaaaatat 

agacgagaaa 

tgtgttgaat 

acaatcaatg 

tacgtacatc 

gttttataac 

attgaatatt 

gtcagtctgt 

caatccaata 

atatttgtat 

gcaaaaggaa 

tattcaggac 

tctaacgttt 

gatgaaaata 

gtagatttag 



cgccatctat 
gaatggagca 
ttttaacaac 
ccactaaagt 
caaaaccaaa 
acgcgacaac 
taccaacaga 
tagaatttaa 
atattgttcc 
atgatgaagg 
atggagtgga 
acaaagcagg 
cagaattcaa 
aacaacttat 
acatgaaaaa 
tggcagacgt 
gacattctct 
atgttgctta 
caaatatgac 
cacaacggaa 
ttcaatgatt 
tgtatgattc 
tgaaaatggc 
taacgtcatt 
caaaagatat 
ttactggtta 
aattcacaag 
cgggattaga 
atggtggtgt 
ttcaaatcaa 
ataaagaaga 
attttgatct 
gcggctatta 
ttgacggtag 
taatagtaaa 
ttgtgttggc 
gtggaaatag 
aaaacataat 
aaataattaa 
aaagaggtta 
acgtaattga 
aatagtacta 
actggtgtga 
caacatttat 
agtaatatta 
gaaaaacaaa 
cttgaaggcc 
actgttggtg 
gttaataaag 
gaagaagttt 
ggtttatata 
aaggaagtaa 
agtaaggata 
actttaaagt 
ctccaaaaag 
gcagggtatg 
aattagttct 
ctcaatgccc 
tattaagatt 
taggaatatt 
agtatgaaaa 
ctagttatga 
tgctttttaa 
aatacaaaga 
atggaagaat 



aataaagctg 
tacaactatg 
gggtgtaatc 
ggaagcacca 
cgcgacaaca 
accatcttca 
aataaatcct 
aaatgagatt 
agattatttc 
agtacatagg 
aagatactcg 
agtaagaatt 
gattactaaa 
tgaaaatcat 
cggtggaaag 
catagatggc 
aaatagaagc 
gcttcttttg 
gtggaagagt 
tcagttttat 
attaaagatg 
aatagacgta 
agcaattgcg 
gcatcaaact 
ctttgactta 
tcgttatagc 
agtacagata 
catatttgtt 
cactaagaaa 
gagagatgaa 
gatttcactt 
gtataaaaag 
tacgtttgaa 
aaatattgaa 
atatggatag 
gagatgaaaa 
ttgatactta 
taaattgagg 
tactgttagg 
attcatagcg 
attaatcata 
tgaaattaaa 
ttacatcaga 
atgatattaa 
gtggtaaggt 
atcaccaatt 
agaatgtctt 
gtgtgactaa 
tgtatggcgg 
cactgaaaga 
aaggtacgac 
ttgatttagg 
ttcaaaatat 
aataaatttg 
gggcgtatct 
agcgtactaa 
tcattaacca 
tttataataa 
ggagcatatg 
aacaacaagt 
aatgaaccgt 
gttaacaaat 
ccaacaaaat 
aaaaacacat 
atttagtgtt 



tatgattcaa 
aaaataacaa 



acaacgacaa 
caatcaacac 
ccgccttcaa 
actaaagtgg 
aaatttaaag 
ggtattattt 
atatataaaa 
aatgtcgatg 
gtcggtggta 
actaaagaag 
gagcagattt 
aatctgtacg 
tacacgtttg 
actaatattg 
tgtcatcgga 
ttatgttcga 
cctgaattta 
ttaacgaaca 
gtttaatgtg 
agcgaacaaa 
aaagcaagtt 
gtaaatgcga 
agagattact 
aaaggtggca 
tttggtaaag 
gttaaagaag 
aatcaagacg 
ggtgacggta 
aaagaactcg 
tttcctaaag 
cttaataaaa 
aaaatagaag 
tatagaggag 
tgaagcgtat 
tagatgcgtg 
gaaagtgtga 
atttcattaa 
cagtatctcg 
taaaaatata 
aacgttagct 
aggccaagca 
agacttatat 
tgaaaactat 
attcttatta 
tgtggtaaaa 
gaaaaataac 
aaatttagat 
acttgatttc 
taaatacggt 
tgataaactg 
agcagtgact 
aagcagctta 
aaatcaacag 
aaattcacat 
tgatttaatt 
atgtgtatta 
aatatgaaat 
gtaatgataa 
ttatatgata 
gttagtggcc 
caaaagttcc 
ggtttagatg 
agtggtgtaa 



tgaatgtaat 
caattgctaa 
cgcaagaagc 
cgccctcaac 
ctaaagtgga 
aaacaccaca 
atttaagagc 
taaaaaaatg 
tagctttagt 
tatttgtcgt 
tcacaaagag 
ataataaagg 
ccttgaaaga 
gtaacgttgg 
aattacacaa 
ataacattga 
aaaacaagaa 
tgatttgaga 
tctgtaaatc 
ttatagattc 
aaaggtcaaa 
tctaataatt 
tagcattagg 
gtgaacatga 
atagtggcgc 
agcattacct 
atattgaaag 
cagaaaatcg 
cttattatga 
ttgctacgta 
actttaaatt 
atagtaagat 
aattacaaac 
ccaacattag 
ttaggcaaca 
cgatgaataa 
atgtcgcttt 
atagttaaaa 
ctaacttaac 
cttatataat 
ttaagacaaa 
aaagcaacat 
gtccacgcaa 
cgatactact 
aacggttcta 
ggaaaagata 
gaattaattg 
aaatcttctg 
gcatcaattg 
aaaattagaa 
aagatcacta 
caattcgagc 
attaatcaaa 
acgatgaaat 
tgtcgttagg 
tacttctgaa 
ttaattaaac 
ttcaaattac 
ttacagcgat 
cagaaaatca 
caaacaagtt 
aaagtcaagg 
aagtattttt 
tctttgcggt 
cgaagaaaaa 



cgaacaaatc 

aacaagttta 

aaacgcgaca 

taaagtagaa 

aacaccgcaa 

atcgccaacc 

gtattatacg 

gacgacaata 

tggaaaagac 

tttagaagaa 

taatagtaaa 

tacaatctct 

acttgatttt 

ttcaggtaaa 

aaaattacaa 

agtgaatata 

gttcagtgac 

acccgaattt 

cctatctatc 

cttaatttac 

tacgccaatt 

acgaatggag 

tattttagca 

agcaaaatat 

aagtaaggaa 

tatctttgat 

atttaaagca 

caacggcaca 

ttatataaac 

cggtagagta 

gagacagtat 

aaaagtgata 

aaatcgcatg 

ataattcaat 

taagttgctt 

taaaaacacc 

agtgacatga 

aattagcatt 

gttggttcaa 

gatagtagat 

atttataaat 

tggcattagg 

aagaaaagca 

catcagaaag 

acgttgtacg 

aagataaata 

atccaaacgg 

aaactaatac 

actcattttt 

agcaattagt 

tcaatttgaa 

gcatgggtga 

tttaaattaa 

gttgaataaa 

ctgtttttat 

agtgatgtcc 

gagtgttaat 

gtaataaaag 

agctaaagcg 

atcggttaat 

acatcaatac 

ttattatgac 

attgggaaaa 

accagaatta 

cgtaaaatca 



150301 
150361 
150421 
150481 
150541 
150601 
150661 
150721 
150781 
150841 
150901 
150961 
151021 
151081 
151141 
151201 
151261 
151321 
151381 
151441 
151501 
151561 
151621 
151681 
151741 
151801 
151861 
151921 
151981 
152041 
152101 
152161 
152221 
152281 
152341 
152401 
152461 
152521 
152581 
152641 
152701 
152761 
152821 
152881 
152941 
153001 
153061 
153121 
153181 
153241 
153301 
153361 
153421 
153481 
153541 
153601 
153661 
153721 
153781 
153841 
153901 
153961 
154021 
154081 
154141 



atatttgagt 
ttttctattg 
tttaaaataa 
ggtagaattg 
ttagatttcg 
aatttgaaat 
gtttaatgat 
taagctgctt 
aacaatgatt 
aaatacgttt 
tctgttttga 
caatatatca 
agcattagga 
agtagaactt 
tgaagaaagt 
cgttttaaac 
aaataaatat 
tataaaaggt 
tggatttgta 
gataaatgag 
aataagaaaa 
aatcgttatt 
ttttgatcgt 
gaattagttt 
tgacatctgt 
ttgtaattca 
aaacaactaa 
cattaagaat 
ttataataaa 
gagcaaataa 
acaacaggaa 
gtaaataaac 
aaaaatatta 
aagattcaag 
gggttagatg 
ggtggtgtaa 
tcaaaagaaa 
aaaataacac 
ctttataaaa 
tataaccttg 
aaacaaatta 
ttgaagcggc 
tcaatcgata 
catgaatata 
gttataataa 
tgtctattac 
ttgcgaatga 
tgattttcta 
gtgaagatat 
ctgaattaat 
ttcgtgaaat 
ttgaaacatc 
tggacttaag 
aagttatggt 
taggtgatgc 
gcgagttcta 
aagataaatt 
ttggtaaaga 
acttagcgcg 
atgatgatac 
acccaccata 
gtggttacgg 
tacattacct 
gtggtgccgc 
ccgtgattgg 



ctctaagaac 
atgaattttt 
gaaaactgtt 
ttattaatat 
agcgtatggc 
aatcaatgat 
tttgatacgt 
ttttgtacac 
cccccaaaaa 
gattttcatt 
tgcaccttat 
agattggagc 
atattaacta 
gatgagacac 
tttgaatcaa 
tttaaccaac 
aaagaaaaaa 
ggcatatata 
agtaatccaa 
ttgttcttta 
atgttagtcg 
aatatgaaag 
atgtttgatg 
gagttaatag 
gtccttatat 
aagagcagaa 
atgaaataat 
aattcaagta 
aatgtatgat 
atatgaaatt 
ctttaacaac 
atgacaagga 
gtgctttgaa 
ttttactgcc 
tgttttttgt 
tacagaataa 
agggtgaaga 
taaaagagct 
caatctcaaa 
atttaagatc 
aagatattga 
ttaacggtga 
tcgtcgttaa 
aaaattccac 
atctattaca 
tgaaaaacaa 
tttaagaggg 
tcgcttctta 
tacgtatcaa 
tgatcaagtc 
tgaaacgcaa 
aacactaggt 
ttcaacgcga 
taaccttgac 
atacgaattt 
tacaccacaa 
acgtcacgtg 
aacgcaagtg 
catgaacatg 
gttggaaaat 
cagtgcgaaa 
caagcttgcg 
agacgatgaa 
agaaggcgtc 
cttaccagtg 



gccgaactta 
ctttattcaa 
gattaaaaaa 
gaaagatgaa 
agatgtcatt 
atatagaata 
gttttaataa 
tttgtatcga 
aatttatgtt 
aataatgatt 
aataaagaca 
aaataaatat 
caggtgtgtt 
aacgcaaata 
caaacattag 
gaaataaaac 
cacatggcct 
gcgttggcgg 
gtctacaagt 
ttcaaaagga 
aaaaatatag 
acgaaaagaa 
taatggatag 
cataatagct 
aaggaactgt 
cagagtaaca 
gaaagtcatt 
tatttaaatc 
tcaaattacg 
tacagcatta 
agaagttcat 
ggcactatac 
acatggtaaa 
tggaaatgat 
tcaagaaaaa 
taaaacatct 
tgcttttgtg 
ggattataag 
agatggtagg 
taaattaaaa 
agttaactta 
aatgtaaatt 
gccgtttttg 
caccaacatc 
caaagagata 
cgtcagcaac 
aatatggatg 
tctgaaaaag 
gaagcatggg 
ggttacttca 
gatttcgata 
gaagaaagtg 
ctaggtaaca 
gacttaccat 
cttatcgggc 
caagtatcta 
tatgacccaa 
tatcgttatt 
ttattacatg 
ccagctttct 
tggactgcag 
ccaaaatcca 
ggtaccatgg 
attcgtcgtt 
aatattttct 



ctagttaaaa 
aaggaagaag 
tacaaactgt 
aataagtatg 
aatagtgaac 
aaagcttaag 
taaaaacata 
ataacttaag 
gctattaaaa 
caagtttatt 
gatagttcaa 
gaaattgact 
tacagcagaa 
ttatatcaat 
cgttaaaagt 
tttcaaagtg 
tgatgtcttt 
tataacaaag 
taaaaaagtt 
agaagtatcg 
attgtataaa 
atatgtaatt 
taagcaaatt 
taagaagcga 
gttaaataca 
tcatcagttg 
taacctgaac 
gaggttaatt 
taatgaaaac 
gcaaaagcaa 
tcaggtcatg 
cgatactaca 
aataacttgc 
aaaagtaaat 
agagataagc 
ggagttgtca 
aaaggttacc 
ttgagaaagc 
gtcaaaatta 
tttaaatata 
aagtaaataa 
ggtgcgcata 
tttgtgtgtc 
aaaattctcc 
aattacttat 
aagctgaatt 
cgagtgaatt 
cggaacaaga 
cagatgaaga 
ttgaaccaca 
tcgaacatct 
aaaatgactt 
atgtcaaaga 
tcgttcacag 
gctttgcggc 
agatactggc 
catgtggttc 
tcggtcaaga 
atgtgcgcta 
taggcaatac 
attcaaaatt 
aagcagactt 
cagttgtact 
atttaattga 
atgggacaag 



aaatagacga 
tgtcattgaa 
atgaagggtc 
aaattgattt 
aaattaagaa 
aagcggttta 
tcgaacattg 
atctaaaact 
atcagttaat 
taaatgagcg 
attacgtaat 
gcattagcaa 
agtaaagctg 
atgctacatc 
gaagattact 
tttctgcttg 
gcagtacctg 
aaaaatgtga 
gatgctaaac 
ttgaaggaac 
ggcgcgtcag 
gatttaagtg 
aaaaatattg 
cttaacgaca 
ttactgttgt 
tagtaaacga 
attaaaatat 
atcgtatgaa 
aatccaatat 
cattagcttt 
caaaacaaaa 
ctggaaagac 
gttttaagtt 
ttcaacagcg 
acgatatatt 
gtgcaccaat 
cttattacat 
atctaattga 
gcttgaaaga 
tgggggaagt 
ttacgaataa 
gcttatacaa 
atgaatccta 
acatcgcaac 
tcaaaggcgg 
acataaaaaa 
ccgtaattac 
atatgcagat 
atatcgtgaa 
agatttattc 
cgcaacggcg 
tatcggactg 
acgtactgca 
tgatatggaa 
gacagcaggt 
gaagattgtc 
aggttcatta 
acgtaacaat 
tgaaaacttt 
atttgatgcg 
tgaaaatgat 
tgcctttatt 
cccacatggt 
agaaaagaac 
tattccaaca 



taaagacggt 
ggaacttgat 
agctgataaa 
aagtgataaa 
catcgaagtg 
ataatcccat 
actacgttat 
aatcggaaag 
acgaatgtta 
ttaatgtcag 
cataacaatc 
aagcaacatt 
ttcacgcgaa 
aatactattc 
atggttctaa 
gtgacgataa 
aattaataga 
gatcagtgtt 
atggcttttc 
tggattttaa 
ataaaggtag 
aaaaattaag 
aagtgaattt 
aaatgtgaat 
taagttgttt 
taatccagta 
atttgttttt 
acgatgcacg 
attaagattg 
aggaatttta 
tcaaaagtca 
tatggaaatg 
tagaggtatt 
tagttatgag 
ttatactgtt 
attaaatatt 
taaaaaagaa 
aaaatacgga 
tggcagtttt 
catagaaagc 
taaaaagtaa 
aaaggatgca 
tcccaatctc 
ataaccaaat 
aggaatcaca 
ttatggtcga 
attttaggct 
gccttgtcag 
gacttaaaag 
agtgcgatga 
attcgtaaag 
ttcagcgata 
ctaatttcca 
attgatatgt 
aaaaaagcag 
acagacggta 
ttgttacgtg 
accacttaca 
gatatccgta 
gttatagcga 
gaacgcttca 
caacatatgg 
gtcttattcc 
tacttagaag 
tgtatcttag 
154201 

154261 

154321 

154381 

154441 

154501 

154561 

154621 

154681 

154741 

154801 

154861 

154921 

154981 

155041 

155101 

155161 

155221 

155281 

155341 

155401 

155461 

155521 

155581 

155641 

155701 

155761 

155821 

155881 

155941 

156001 

156061 

156121 

156181 

156241 

156301 

156361 

156421 

156481 

156541 

156601 

156661 

156721 

156781 



tatttaaaaa 

ttgaaaaagg 

catacaagcg 

ccgataacga 

caattgattt 

ttgaacaaga 

caaaagaaaa 

cagttggggg 

cctttaacaa 

gtttcgtcga 

aaaagttatt 

ggtgtattgt 

atggaagcat 

ggtgcaagaa 

aaatatccaa 

caaattgaat 

cagaaaattt 

tgggaaaata 

caaatgcttt 

gataattcaa 

aattctatga 

cctgcatata 

tttaaaacac 

acatggaact 

gaacaagaaa 

atgaaaattg 

ctttgataaa 

aaaagtaaag 

ttttgtgtcg 

gttattcgta 

attaaattac 

attgctcatt 

taaaaattgg 

agggatttta 

agaggttaga 

attaacaaat 

acaaattgat 

ttctaatgtt 

gattggtgga 

tttgtcagtt 

catctataaa 

tgataaacat 

aaacggtggc 

tacgattgat 



atgtcgccaa 

aaaaaatcaa 

taaagaaaca 

ttacaaccta 

agatcaagtc 

aatcaatgca 

atgtgccaga 

atcttacaga 

tatccggaca 

aaaatctaga 

ctaatggata 

cctctttgta 

attttgattc 

atcacggatt 

gtttagaaga 

tagaagaaca 

tctcacagga 

gcaaaataga 

cagtaactat 

gtaaagataa 

gaatgtggca 

ctgtgcttta 

atagaatgat 

taaaatataa 

agataggtga 

aaatattaga 

tacatagatt 

tgaattaaaa 

aaattgtgta 

aataaaagag 

ttaataatga 

ataatgaatg 

gagcatagaa 

acaacaggta 

tcacaagcta 

gtaacaggat 

gttacattga 

gacgtgtttg 

attacaaaga 

tctaagagta 

gaagaaatct 

gatctttata 

tattatacat 

agtagaaata 



caagacgaca 
aaccatttaa 
attgataaat 
aacataccga 
caacaagatt 
tacctgaaag 
gttgagattc 
tagagtaatt 
gttaggttta 
aaattataca 
cccattaggg 
tatttgtttt 
gacacactgg 
attaaatgtt 
acagcaaaaa 
aaagcttgaa 
actgcgattc 
aaaatattta 
aaatagtggc 
aagtaattat 

aggggctagt 

tccaacacaa 
tcataaattt 
acaattaaaa 
tttctttaaa 
aaaagagaaa 
gcataagaat 
acgaacatta 
cagaataagt 
agtagatcga 
ttaattttta 
aggattgttc 
ttatgaaatt 
tgattacaac 
ctcaagactt 
ataaatacgg 
caggaaatga 
tagtaagaga 
caaatgggac 
caggtcaaca 
cattaaaaga 
agacagaacc 
ttgaattaaa 
tagagaaaat 



acgtattatt 

gcgatgccca 

acagttacag 

ggtatgtcga 

tgaaaaatat 

aacttggggt 

ccagggtttg 

aggaaaaata 

attgatcaaa 

ctaataaaga 

gctattaaaa 

tctattaaaa 

tatagagaag 

tctgtgaatg 

ataggcaagt 

ttacttcaac 

aaagatgaga 

aaagagagaa 

attataaaat 

aaagtagtta 

ggtaaatcaa 

aatactagct 

aaaattaatt 

aatataaata 

aaaatggata 

caatcctttt 

aaaatttgta 

aatttaggca 

agttaaataa 

taggaattga 

gttaaagtaa 

gtattgcgta 

aaaaaatatt 

tactgctcag 

gagtgaatat 

aaataaagtt 

aaaattaact 

aggtagtgac 

tcaacataaa 

cactacttct 

acttgatttt 

taaagacagt 

taaaaaatta 

tgaagtgaat 



tatcgatgca 
agtcgaacgt 
tgcgacatta 
tacattcgaa 
cgacaaagaa 
gttgaaagat 
aaggcgaatg 
aaaacttaga 
cagaatattt 
atggagaatt 
gattaactag 
gtgaaatgtc 
tttctggaat 
atttttttac 
tcttcagcaa 
aacagaaaaa 
atggtgaaga 
acgaacgttc 
ttagtgaatt 
ggaaaaatga 
attataatgg 
cattatttat 
cacaaggatt 
tagatatacc 
tattgataag 
tacaaaaaat 
taatttaaca 
ctgtgaaagc 
agattaagtt 
atgatattag 
gtttaatgtg 
atagaataaa 
gctaaagcaa 
ccagtaaaag 
tataaaggga 
acatttatag 
gttaaagatg 
aaatcagcta 
gatactgttc 
gtgacttcag 
aaattaagaa 
aaaattagaa 
cagcctcatc 
ttataataat 



tccaatgatt 

attattgaca 

caagagattg 

gaagaagcgc 

atcgcagaaa 

gagtaataca 

ggaagagaag 

atcgaaaaag 

tagtaaatca 

cgcgtataac 

atatgatagt 

taaagacttc 

tgcagttgag 

tattctaatt 

actcgaccga 

aggctatatg 

ttatccagat 

tgacaaaggg 

ggatagaaaa 

tattgcatat 

gattgttagc 

tggatataag 

aacatcagat 

tgtattggag 

taaacagaaa 

gttcttataa 

taaaagttgt 

gcagtgtctt 

gagataaagt 

ttaactattt 

aagcacgacc 

tcaaatagac 

gtttagcact 

caagtacatt 

gaggatttga 

ataattctca 

atgacgaagt 

tcacaacatc 

aaaacgttaa 

aatactatag 

agcatttaat 

ttactatgaa 

gcatgggtga 

attcgaggga 



SEQ ID No: 2 - SSL1 

     gene 
                     /gene="(set6) N315ssl1" 
     CDS             143102..143782 
                     /gene="(set6) N315ssl1" 
                     /note="ORFID:SA0382 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 6) SSL1" 
                     /protein_id="BAB41610.1" 
                     /db_xref="GI:13700312" 
                     /translation="MKFKAIAKASLALGMLATGVITSNVQSVQAKTEVKQQSESELKH 
                     YYNKPVLERKNVTGYKYTEKGKDYIDVIVDNQYSQISLVGSDKDKFKDGDNSNIDVFI 
                     LREGDSRQATNYSIGGVTKTNSQPFIDYIHTPILEIKKGKEEPQSSLYQIYKEDISLK 
                     ELDYRLRERAIKQHGLYSNGLKQGQITITMKDGKSHTIDLSQKLEKERMGDSIDGRQI 
                     QKILVEMK" 
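
The qualifiers in an entry such as SSL1 link its translation to the nucleotide listing: the CDS coordinates are 1-based and inclusive, /codon_start=1 places the first codon on the first base of the range, and /transl_table=11 selects the bacterial genetic code. The following is a minimal sketch of that relationship, not part of the listing itself. The helper names `extract` and `translate` are illustrative assumptions, and the row of SEQ ID No: 1 beginning at position 143101 is transcribed here by reading across the listing's six 10-base columns.

```python
# Standard genetic code in TCAG order. For sense codons GenBank table 11
# (bacterial) is identical; it differs only in which codons may initiate.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def extract(row_start, row, start, end):
    """Slice 1-based inclusive coordinates out of a row that begins at row_start."""
    seq = row.replace(" ", "").lower()
    return seq[start - row_start : end - row_start + 1]


def translate(cds):
    """Translate complete codons, starting at the first base (codon_start=1)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i : i + 3]] for i in range(0, usable, 3))


# Row of SEQ ID No: 1 starting at position 143101 (assumed transcription
# of the six 10-base columns of the listing).
ROW_143101 = "tatgaaattt aaagcgatag caaaagcaag tttagcattg ggaatgttag caacaggtgt"

# First 15 codons of the SSL1 CDS (143102..143782).
first_codons = extract(143101, ROW_143101, 143102, 143146)
print(translate(first_codons))  # MKFKAIAKASLALGM
```

The printed residues agree with the first 15 residues of the /translation qualifier given for SSL1. The same check can be applied to any other forward-strand CDS in the listing by substituting its row and coordinates.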
SEQ ID No: 3 - SSL2 

     gene            144068..144763 
                     /gene="(set7) N315ssl2" 
     CDS             144068..144763 
                     /gene="(set7) N315ssl2" 
                     /note="ORFID:SA0383 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 7) SSL2" 
                     /protein_id="BAB41611.1" 
                     /db_xref="GI:13700313" 
                     /translation="MKMKNIAKISLLLGILATGVNTTTEKPVHAEKKPIVISENSKKL 
                     KAYYTQPSIEYKNVTGYISFIQPSIKFMNIIDGNSVNNLALIGKDKQHYHTGVHRNLN 
                     IFYVNEDKRFEGAKYSIGGITSANDKAVDLIAEARVIKADHIGEYDYDFFPFKIDKEA 
                     MSLKEIDFKLRKYLIDNYGLYGEMSTGKITVKKKYYGKYTFELDKKLQEDRMSDVINV 
                     TDIDRIEIKVRKA" 



SEQ ID No: 4 - SSL3 

     gene            145054..146124 
                     /gene="(set8) N315ssl3" 
     CDS             145054..146124 
                     /gene="(set8) N315ssl3" 
                     /note="ORFID:SA0384 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 8) SSL3" 
                     /protein_id="BAB41612.1" 
                     /db_xref="GI:13700314" 
                     /translation="MNMKTIAKTSLALGLLTTGAITVTTQSVKAEKIQSTKVDKVPTL 
                     KAERLAMINITAGANSATTQAANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEE 
                     QKSLNISATPAPKQEQSQTTTESTTPKTKVTTPPSTNTPQPMQSTKSDTPQSPTIKQA 
                     QTDMTPKYEDLRAYYTKPSFEFEKQFGFLLKPWTTVRFMNVIPNRFIYKIALVGKDEK 
                     KYKDGPYDNIDVFIVLEDNKYQLKKYSVGGITKTNSKKVDHKAELSVTKKDNQGMISR 
                     DVSEYMITKEEISLKELDFKLRKQLIEKHNLYGNMGSGTIVIKMKNGGKYTFELHKKL 
                     QEHRMADVIEGTNIDKIEVNIK" 

SEQ ID No: 5 - SSL4 

     gene            146488..147366 
                     /gene="(set9) N315ssl4" 
     CDS             146488..147366 
                     /gene="(set9) N315ssl4" 
                     /note="ORFID:SA0385 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 9) SSL4" 
                     /protein_id="BAB41613.1" 
                     /db_xref="GI:13700315" 
                     /translation="MKITTIAKTSLALGLLTTGVITTTTQEANATTPSSTKVEAPQST 
                     PPSTKVEAPQSKPNATTPPSTKVETPQQTPNATTPSSTKVETPQSPTTKQVPTEINPK 
                     FKDLRAYYTKPSLEFKNEIGIILKKWTTIRFMNIVPDYFIYKIALVGKDDKKYDEGVH 
                     RNVDVFVVLEEKNKYGVERYSVGGITKSNSKKVDHKAGVRITKEDNKGTISHDVSEFK 
                     ITKEQISLKELDFKLRKQLIENHNLYGNVGSGKIVINMKNGGKYTFELHKKLQENRMA 
                     DVIDGTNIDNIEVNIK" 

WO 2005/092918 
PCT/GB2005/001084 



SEQ ID No: 6 - SSL5 

     gene            147730..148434 
                     /gene="set10 N315ssl5" 
     CDS             147730..148434 
                     /gene="set10 N315ssl5" 
                     /note="ORFID:SA0386 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 10)SSL5" 
                     /protein_id="BAB41614.1" 
                     /db_xref="GI:13700316" 
                     /translation="MKMAAIAKASLALGILATGTITSLHQTVNASEHEAKYENVTKDI 
                     FDLRDYYSGASKELKNVTGYRYSKGGKHYLIFDKNRKFTRVQIFGKDIERFKARKNPG 
                     LDIFVVKEAENRNGTVFSYGGVTKKNQDAYYDYINAPRFQIKRDEGDGIATYGRVHYI 
                     YKEEISLKELDFKLRQYLIQNFDLYKKFPKDSKIKVIMKDGGYYTFELNKKLQTNRMS 
                     DVIDGRNIEKIEANIR" 

SEQ ID No: 7 - SSL7 

     gene            148880..149575 
                     /gene="set11 N315ssl7" 
     CDS             148880..149575 
                     /gene="set11 N315ssl7" 
                     /note="ORFID:SA0387 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 11)SSL7" 
                     /protein_id="BAB41615.1" 
                     /db_xref="GI:13700317" 
                     /translation="MKLKTLAKATLALGLLTTGVITSEGQAVHAKEKQERVQHLYDIK 
                     DLYRYYSSESFEFSNISGKVENYNGSNVVRFNQEKQNHQLFLLGKDKDKYKKGLEGQN 
                     VFVVKELIDPNGRLSTVGGVTKKNNKSSETNTHLFVNKVYGGNLDASIDSFLINKEEV 
                     SLKELDFKIRKQLVEKYGLYKGTTKYGKITINLKDEKKEVIDLGDKLQFERMGDVLNS 
                     KDIQNIAVTINQI" 

SEQ ID No: 8 - SSL8 

     gene            149914..150612 
                     /gene="set12 N315ssl8" 
     CDS             149914..150612 
                     /gene="set12 N315ssl8" 
                     /note="ORFID:SA0388 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 12) SSL8" 
                     /protein_id="BAB41616.1" 
                     /db_xref="GI:13700318" 
                     /translation="MKFTAIAKAIFVLGILTTSVMITENQSVNAKGKYEKMNRLYDTN 
                     KLHQYYSGPSYELTNVSGQSQGYYDSNVLLFNQQNQKFQVFLLGKDENKYKEKTHGLD 
                     VFAVPELVDLDGRIFSVSGVTKKNVKSIFESLRTPNLLVKKIDDKDGFSIDEFFFIQK 
                     EEVSLKELDFKIRKLLIKKYKLYEGSADKGRIVINMKDENKYEIDLSDKLDFERMADV 
                     INSEQIKNIEVNLK" 
SEQ ID No: 9 - SSL9 

     gene            150989..151687 
                     /gene="set13 N315ssl9" 
     CDS             150989..151687 
                     /gene="set13 N315ssl9" 
                     /note="ORFID:SA0389 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 13)SSL9" 
                     /protein_id="BAB41617.1" 
                     /db_xref="GI:13700319" 
                     /translation="MKLTALAKATLALGILTTGVFTAESKAVHAKVELDETQRKYYIN 
                     MLHQYYSEESFESTNISVKSEDYYGSNVLNFNQRNKTFKVFLLGDDKNKYKEKTHGLD 
                     VFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKVDAKHGFSINELFFIQK 
                     EEVSLKELDFKIRKMLVEKYRLYKGASDKGRIVINMKDEKKYVIDLSEKLSFDRMFDV 
                     MDSKQIKNIEVNLN" 



SEQ ID No: 10 - SSL10 

     gene            152053..152736 
                     /gene="set14 N315ssl10" 
     CDS             152053..152736 
                     /gene="set14 N315ssl10" 
                     /note="ORFID:SA0390 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 14)SSL10" 
                     /protein_id="BAB41618.1" 
                     /db_xref="GI:13700320" 
                     /translation="MKFTALAKATLALGILTTGTLTTEVHSGHAKQNQKSVNKHDKEA 
                     LYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKFQQRSYEGLDVFF 
                     VQEKRDKHDIFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAFVKGYPYYIKKEKITL 
                     KELDYKLRKHLIEKYGLYKTISKDGRVKISLKDGSFYNLDLRSKLKFKYMGEVIESKQ 
                     IKDIEVNLK" 

SEQ ID No: 11 - SSL11 

     gene            156143..156826 
                     /gene="(set15) N315ssl11" 
     CDS             156143..156826 
                     /gene="(set15) N315ssl11" 
                     /note="ORFID:SA0393 
                     [Pathogenicity island SaPIn2]" 
                     /codon_start=1 
                     /transl_table=11 
                     /product="(exotoxin 15)SSL11" 
                     /protein_id="BAB41622.1" 
                     /db_xref="GI:13700324" 
                     /translation="MKLKNIAKASLALGILTTGMITTTAQPVKASTLEVRSQATQDLS 
                     EYYKGRGFELTNVTGYKYGNKVTFIDNSQQIDVTLTGNEKLTVKDDDEVSNVDVFVVR 
                     EGSDKSAITTSIGGITKTNGTQHKDTVQNVNLSVSKSTGQHTTSVTSEYYSIYKEEIS 
                     LKELDFKLRKHLIDKHDLYKTEPKDSKIRITMKNGGYYTFELNKKLQPHRMGDTIDSR 
                     NIEKIEVNL" 

SEQ ID No: 12 

244981 agtgcatgaa gtataagtca ccttcatata ctaatcaaag aggacgtcaa cagttatttt 






245041 
245101 
245161 
245221 
245281 
245341 
245401 
245461 
245521 
245581 
245641 
245701 
245761 
245821 
245881 
245941 
246001 
246061 
246121 
246181 
246241 
246301 
246361 
246421 
246481 
246541 
246601 
246661 
246721 
246781 
246841 
246901 
246961 
247021 
247081 
247141 
247201 
247261 
247321 
247381 
247441 
247501 
247561 
247621 
247681 
247741 
247801 
247861 
247921 
247981 
248041 
248101 



attaggattt 
agtgtagtta 
taatatttta 
tactggttta 
gtattgcaca 
tgtaaatgtc 
cacatctacg 
agatatatgt 
ttgatatagc 
attgatgtta 
tactggtttt 
tatatttttc 
aggaataaga 
ctccattatt 
tttttaacat 
ttattgtctg 
ttacttttaa 
ttattgcctt 
acattcagtt 
gtttcttgag 
acgcttttat 
tgactatacc 
aacgtactgt 
ttagtttcgt 
ttaggaaatt 
ttcgtgatgt 
aaaaaggggt 
agatttttcg 
gtcagttaat 
cttaatttga 
acgaatacgg 
atgtccaccg 
atcagatgaa 
accttcaact 
attaatgttt 
ttgcaatgtt 
aaaatgcggc 
attataggct 
ggttcctaaa 
atgaaaatct 
taaataaaag 
aacttatcat 
tattaaaaac 
gaattttcta 
tcctactact 
aattttaata 
aaaaggagtt 
tttcacgaca 
atattggcac 
attctacaag 
cttatttagg 
gtgtgcttgg 



ttaacataaa 
ttaccgccac 
ttttttatta 
ttgccttggt 
aaaagattat 
tctctataag 
taacgatctt 
gctgaccatg 
catttatttg 
gtttcatctt 
tgattttgag 
gtgatgttct 
aatttaaatg 
attgttagtt 
atctatttgc 
tacctgttat 
ttaatgtttg 
cataaactgg 
tatagtgttt 
ataatgcaaa 
ttcttaattg 
aagaaaactg 
tagagaaggt 
tttgatttat 
gtgtaaatgc 
tattgttcat 
taattagata 
agtatgactt 
tttaactttt 
cctttattaa 
aaatctaatt 
atgatagttt 
gtagctgctg 
gtgtcctcat 
ccccaactct 
gcatgaatgc 
tgagtgtaat 
tttgcttcag 
agtaacaatg 
cctttgcgtg 
ctaaaactat 
ttttaagttt 
gacacgttac 
actacttgaa 
taaatttaat 
gtaggttgtg 
gttattaatg 
agaggtagaa 
tgaaaagcct 
aacgcgatgt 
cccaactgga 
tggaatgtat 



catttgctag 
cggtgatctt 
acgcttctcg 
atattttatg 
aagttttatc 
accaccataa 
ttaactgatt 
aatatctctt 
aaaatgaaaa 
gaccaatgct 
gtaatacagt 
tactcattag 
tgagcgaagt 
tgatttttcg 
atcagttgat 
tttaatttgt 
acgaatacga 
aaatccgcca 
atcgttagat 
agaatcaatg 
gttgatattg 
taacaacgtt 
taaatatttt 
ataataagct 
tgtacctaat 
tcgaatttct 
attgaaatta 
caatttgtgc 
tacttaaatc 
attctccgtt 
cttttaaagt 
gttgtttatc 
gcgtaacacc 
ctttagttcc 
cggatccaaa 
taccattggg 
atttgattaa 
atgaaaaact 
ctgttgataa 
aattacccaa 
gttaaataaa 
tggacagaaa 
aattattctc 
aaatagttat 
attaataaaa 
tttattttgt 
aaaaatttac 
ttcttattaa 
atgttaaaaa 
gcatttgaag 
tcacaaatgg 
gatggcattg 



atctgaatgt 
aagcttacct 
tgcacggaaa 
cgcaccaata 
agaaggtttt 
ctgatcagta 
aatatttccc 
aagttcaaca 
atgaggctgt 
ataagcttta 
acctaatagt 
aacatctcct 
caatatagta 
aggataactt 
ggcaaccttt 
cctttattat 
aaatctaatt 
gtaaacgttt 
gttgctgcag 
gtttcttggt 
ccccaacttt 
gcatgaatcg 
tgcgagtaat 
ttcgcttcag 
agtaacaatg 
cctttgagta 
tccgcattta 
atttttagga 
aatcgtgtaa 
atataacttt 
taatactggt 
actatatttt 
accagtaaat 
aaatatatca 
cacttgaata 
cttttgccat 
ctcattgata 
gattggtgtt 
aactaattta 
agtatataag 
cttaaacagt 
cagtacttaa 
taatcaattg 
actttaaatg 
tgttcattta 
atgcgcttac 
gaaacagaag 
cactctccga 
ataaaaatat 
ttgcagcgca 
gtaaaaaaga 
aataccgtgg 



aatcttttgc 

ttattacgat 

tcgatttctt 

atcgtttgta 

gcggctggtg 

tctttgtctt 

caactttcag 

taaatgtttc 

gtgtaatatt 

gcttcagagt 

aataatgttg 

ttcagaggaa 

tttgcgatta 

caatttttgc 

tacttaaatc 

aagaattatt 

cttttaaagt 

ctgctttatc 

gagtaacacc 

cttttatgcc 

caggtccata 

taccgttatc 

atttagttaa 

atgaagaatt 

ttgttgataa 

ttgttggaat 

caaaaggtaa 

tttttaacat 

ttatttccat 

ttattcttta 

ttatttcctt 

aaaaatagtc 

gtttcatcat 

acgtatttat 

tgactatacc 

agccattttc 

ttagtctcgt 

ttaggaagtt 

ttcatgatgt 

ctattacacc 

tagtagtgtt 

taaagtaggc 

cattaaattg 

tagtacttat 

attattgata 

aatttaggtg 

ttttttaact 

ggatttaaaa 

tgcactgtta 

tgatcaaggt 

aacaactaaa 

tttttcacaa 



ttaaatcaat 

tttcggtata 

tcaatgttaa 

gtttatcttt 

taacgccacc 

ttagtccaaa 

cgccccataa 

cattatcata 

taattaattc 

aaaaactaaa 

tcgttaaaat 

tcatgatacg 

tttttattaa 

attttgaggt 

tattgtgtag 

atataatttt 

taaaacaggc 

tttatatgtt 

accagtaaac 

aaaaatatca 

aacttgaata 

tttttgccat 

ctcattaacg 

gataggtgta 

aataattttt 

gtttaattat 

taggttagtt 

aacggtttgt 

cagcagttat 

ttaatgtttg 

tgtaaaattc 

tataaggttt 

aagtccagta 

ttcttaactg 

aaacccacgt 

cagataatga 

tttcactgat 

gtgttgatgt 

tctttttcat 

gattcggaat 

atttaagcaa 

gggagttata 

tttgataatt 

tttaattatt 

aaatattaca 

taactaaaat 

ttattagact 

cgtgctaaat 

tttgaaaaag 

gcaaatgtaa 

gatactgcac 

agaacagtag 






SEQ ID No: 13 - SSL14 

     gene            complement(246655..247380) 
                     /gene="SA1011 N315ssl14" 
     CDS             complement(246655..247380) 
                     /gene="SA1011 N315ssl14" 
                     /note="ORFID:SA1011" 
                     /codon_start=1 
                     /transl_table=11 
                     /protein_id="BAB42263.1" 
                     /db_xref="GI:13700967" 
                     /product="(SA1011) SSL14" 
                     /translation="MKKNIMNKLVLSTALLLLGTTSTQLPKTPISFSSEAKAYNISEN 
                     ETNINELIKYYTQPHFSLSGKWLWQKPNGSIHATLQTWVWYSHIQVFGSESWGNINQL 
                     RNKYVDIFGTKDEDTVEGYWTYDETFTGGVTPAATSSDKPYRLFLKYSDKQQTIIGGH 
                     EFYKGNKPVLTLKELDFRIRQTLIKNKKLYNGEFNKGQIKITADGNNYTIDLSKKLKL 
                     TDTNRYVKNPKNAQIEVILEKSN" 






WO 2005/092918 



PCT/GB2005/001084 



SEQ ID No: 14 - SSL13 

gene complement(245835..246560)
/gene="(SA1010) N315ssl13"
CDS complement(245835..246560)
/gene="(SA1010) N315ssl13"
/note="ORFID:SA1010"
/codon_start=1
/transl_table=11
/protein_id="BAB42262.1"
/db_xref="GI:13700966"
/product="(SA1010) SSL13"
/translation="MKKNITKKIILSTTLLLLGTAFTQFPNTPINSSSEAKAYYINQN
ETOVNELTKYYSQKYLTFSNSTLWQKDNGTIHATLLQFSWYSHIQVYGPESWGNINQL
RNKSVDIFGIKDQETIDSFALSQETFTGGVTPAATSNDKHYKLNVTYKDKAETFTGGF
PVYEGNKPVLTLKELDFRIRQTLIKSKKLY13NSYNKGQIKITGTDNNYTIDLSKRLPS
TDANRYVKKPQNAKIEVILEKSN"

SEQ ID No: 15 - SSL12

gene complement(245011..245727)
/gene="(SA1009) N315ssl12"
CDS complement(245011..245727)
/gene="(SA1009) N315ssl12"
/note="ORFID:SA1009"
/codon_start=1
/transl_table=11
/protein_id="BAB42261.1"
/db_xref="GI:13700965"
/product="(SA1009) SSL12"
/translation="MSKNITKNIILTTTLLLLGTVLPQNQKPVFSFYSEAKAYSIGQD
ETKINELIKYYTQPHFSFSNKWLYQYDNGNIYVELKRYSWSAHISLWGAESWGNINQL
KDRYVDVFGLKDKDTDQLWWSYRETFTGGVTPAAKPSDKTYNLFVQYKDKLQTIIGAH
KIYQGNKPVLTLKEIDFRAREALIKNKILYTENRNKGKLKITGGGNNYTIDLSKRLHS
DLANVYVKNPNKITVDVLFD"

S. aureus strain Mu50 taken from GenBank

SEQ ID No: 16

116881 aaaaatcaaa tttaaataga ttggggctaa aaattatgaa atttaaagcg atagcaaaag
116941 caagtttagc attgggaatg ttagcaacag gtgtaattac atcgaatgta caatcagtac
117001 aagcgaaaac agaagttaaa caacaaagtg aatcagagtt gaaacactat tataataaac
117061 cggttttaga gcgtaaaaat gttactggat ataaatatac tgaaaaaggt aaagattata 
117121 tagatgttat agtagacaat caatattctc aaatttcttt agttggatct gataaagaca 
117181 aatttaaaga tggagacaac tcgaatatag atgtgtttat ccttagagaa ggtgacagta 

117241 gacaagcaac aaattactca attggtggcg taacaaaaac aaacagtcaa ccttttattg

117301 actatataca cacaccaatc cttgaaatca agaaaggtaa agaagaacca caaagtagtt 
117361 tataccaaat ttataaagaa gacatctcat tgaaagaact tgattataga ttaagagaac 
117421 gtgcaatcaa acaacacggc ttgtattcaa atggtcttaa acaaggtcaa attacaatta 
117481 caatgaaaga tggcaaatca catactatcg atttaagtca aaaacttgaa aaagaacgta 

117541 tgggtgattc tatcgacggc agacaaatac aaaaaattct agtagaaatg aaataatact
117601 ttctaacaac aaagcgctat gttgaatagt gcttgttatg gaaatatatg gaagttaagc
117661 gacgtactgt tgcttagctt ctttttttga ggggaaaagt tacaaaactc acacaaacag
117721 tcgcaccacg cattatcttt tgcttaaata gcttaatcat attttatgaa tagttaaaaa 
117781 caggttaatg tgaatatccg aatacagctc ctataatatg ggtgtatggt tcaaattacg 

117841 taataaaaca atctaattat aatagattgg agcatacaac tatgaaaatg aaaaatattg
117901 caaaaataag tttgttatta ggaatattag caacaggtgt aaacactaca acggaaaaac
117961 cagttcatgc cgaaaagaaa cctattgtaa taagtgaaaa tagcaaaaaa ttaaaagctt 
118021 attatactca acctagtatt gaatataaaa atgtgacagg ttatatcagt ttcattcaac 
118081 caagtattaa atttatgaat atcatagatg gtaattctgt taataacctt gctttaattg 



118141 gcaaagataa gcaacattat catacgggtg tacatcgtaa tcttaatata ttttacgtta
118201 atgaggataa gagatttgaa ggtgcaaagt actctattgg gggtatcact agtgcaaacg 
118261 ataaagctgt cgacctaata gcagaagcaa gagttattaa agcagatcat attggtgaat 
118321 atgattatga ctttttccca tttaaaatag ttaaagaagc gatgtcattg aaagagattg 
118381 attttaaatt aagaaaatac cttattgata attatggtct ttacggtgaa atgagtacag

118441 ggaaaattac cgtcaaaaag aaatactatg gaaagtatac atttgaattg gataaaaagt 
118501 tacaagaaga ccgtatgtcc gatgttatca atgtcacaga tattgataga attgaaatca 
118561 aagttagaaa agcttaatac acatacttga cgacgaaata atttgaaatt gaaatagaga 
118621 ggttaagtga cgatcaaacg ttgcttaact tctttttaat gcttaaaaat catttcaaag 

118681 gcacatagaa acgctatatt aacctcataa tcactcatta ttttttgctt aaattactta

118741 ataatacttc aataattgtt aaaaagggtt taatgtgatt atcttagaac gccatctata 
118801 atgatgttgt atgattcaaa ttacgtaaaa agacaatcga atataatata gattggagta 
118861 tacaattatg aatatgaaaa caattgctaa aaccagttta gcactagggc ttttaacaac
118921 aggcgcaatt acagtaacga cgcaatcggt caaagcagaa aaaatacaat caactaaagt 

118981 tgacaaagta ccaacgctta aagcagagcg attagcaatg ataaacataa cagcaggtgc

119041 aaattcagcg acaacacaag cagctaacac aagacaagaa cgcacgccta aactcgaaaa 
119101 ggcaccaaat actaatgagg aaaaaacctc agcttccaaa atagaaaaaa tatcacaacc 
119161 taaacaagaa gagcagaaat cgcttaatat atcagcaacg ccagcgccta aacaagaaca 
119221 atcacaaacg acaaccgaat ccacaacgcc gaaaactaaa gtgacaacac ctccatcaac 

119281 aaacacgcca caaccaatgc aatctactaa atcagacaca ccacaatctc caaccataaa

119341 acaagcacaa acagatatga ctcctaaata tgaagattta agagcgtatt acacgaaacc 
119401 gagttttgaa tttgaaaagc agtttggatt tttgctcaaa ccatggacga cggttaggtt 
119461 tatgaatgtt attccaaata ggttcatcta taaaatagct ttagttggaa aagatgagaa
119521 aaaatataaa gatggacctt acgataatat cgatgtattt atcgttttag aagacaataa 

119581 atatcaatta aaaaaatatt ctgtcggtgg catcacgaag actaatagta aaaaagttga

119641 tcacaaagca gaattaagcg ttactaaaaa agataatcaa ggtatgattt cacgcgatgt 
119701 ttcagaatac atgattacta aggaagagat ttccttgaaa gagcttgatt ttaaattgag 
119761 aaaacaactt attgaaaaac ataatcttta cggtaacatg ggttcaggaa caatcgttat
119821 taaaatgaaa aacggtggga agtatacgtt tgaattacac aaaaaactgc aagagcatcg 

119881 tatggcagat gtcatagaag gtacaaacat tgataaaatt gaagtgaata taaaataatc

119941 atgacgttct ctaaatagaa gctgacatcg gtaaaacaag aagttaagtg acaacggttt 
120001 acatgttgct tagcttcttt tattatgcgt aatgatgtaa aagacgaata ttcatttgtt 
120061 tgtaaaagtg gcatttctat gtcttaaaag tgacgaatcc tcaaatgtgc caagtgttga 
120121 atcacatcaa aatcagtttt atttaacgaa cattatggat ttcttaattt acttaacgat 

120181 gattcaaata tagttaaaca aggtttaatg tgaatggagc aatacgccat ctataataaa

120241 gctgtatgat tcaatgaatg taatcgaaca aatctaataa ttacgaatgg agcatacaac 
120301 tatgaaaatg gcagcaattg cgaaagcaag tttagcatta ggtattttag caacaggaac 
120361 aataacgtca ttgcatcaaa ctgtaaatgc gagtgaacat gaagcaaaat atgaaaatgt 
120421 gacaaaagat atctttgact taagagatta ctatagtggc gcaagtaagg aacttaaaaa 

120481 tgttactggt tatcgttata gcaaaggtgg caagcattac cttatctttg ataaaaatag

120541 aaaattcaca agagtacaga tatttggtaa agatattgaa agatttaaag cacgcaaaaa 
120601 tccgggatta gacatatttg ttgttaaaga agcagaaaat cgcaacggca cagtgttttc
120661 atatggtggt gtcactaaga aaaatcaaga cgcttattat gattatataa acgcaccaag
120721 atttcaaatc aagagagatg aaggtgacgg tattgctacg tacggtagag tacactacat 

120781 ttataaagaa gagatttcac ttaaagaact cgactttaaa ttgagacagt atttaattca

120841 aaattttgat ctgtataaaa agtttcctaa agatagtaag ataaaagtga taatgaaaga 
120901 tggcggctat tatacgtttg aacttaataa aaaattacaa acaaatcgca tgagtgatgt 
120961 cattgacggt agaaatattg aaaaaataga agccaacatt agataattca atgaaatatg 
121021 gataatagta aaatatggat agtatagagg agttaggcaa cataagttgc ttagcttctt 

121081 ttttgtgttg gcgagatgaa aatgaagcgt atcgatgaat aataaaaaca ccaataaaac

121141 ttgtggaaat agttgatact tatagatgcg tgatgtcgct ttagtgacat gaaacaatgt 
121201 ggaaaacata attaaattga gggaaagtgt gaatagttaa aaaattagca ttgtgttata 
121261 aaaaataatt aatactgtta ggatttcatt aactaactta acgttggttc aaaaatagtt
121321 aaaaagaggt taattcatag cgcagtatct cgcttatata atgatagtag attgttcgta 

121381 ttacgtaatt gaattaatca tataaaaata tattaagaca aaatttataa atagattggg

121441 agaatagtac tatgaaatta aaaacgttag ctaaagcaac attggcatta ggcttattaa 
121501 ctactggtgt gattacatca gaaggccaag cagtccacgc aaaagaaaag caagagagag 
121561 tacaacattt atatgatatt aaagacttat atcgatacta ctcatcagaa agttttgaat 
121621 tcagtaatat tagtggtaag gttgaaaact ataacggttc taacgttgta cgctttaacc 

121681 aagaaaaaca aaatcaccaa ttattcttat taggaaaaga taaagataaa tataaaaaag

121741 gccttgaagg ccagaatgtc tttgtggtaa aagaattaat tgatccaaac ggtagactat 
121801 ctactgttgg tggtgtgact aagaaaaata acaaatcttc tgaaactaat acacatttat 
121861 ttgttaataa agtgtatggc ggaaatttag atgcatcaat tgactcattt ttaattaata
121921 aagaagaagt ttcactgaaa gaacttgatt tcaaaattag aaagcaatta gttgaaaaat 

121981 atggtttata taaaggtacg actaaatacg gtaagatcac tatcaatttg aaagacgaga
122041 aaaaggaagt aattgattta ggtgataaac tgcaattcga gcgcatgggt gatgtgttga
122101 atagtaagga tattcaaaat atagcagtga ctattaatca aatttaaatt aaacaatcaa 
122161 tgactttaaa gtaataaatt tgaagcagct taacgatgaa atgttgaata aatacgtaca 
122221 tcctccaaaa aggggcgtat ctaaatcaac agtgtcgtta ggctgttttt atgttttata 
122281 acgcagggta tgagcgtact aaaaattcac attacttctg aaagtgatgt ccattgaata

122341 ttaattagtt cttcattaac catgatttaa ttttaattaa acgagtgtta atgtcagtct 
122401 gtctcaatgc cctttataat aaatgtgtat tattcaaatt acgtaataaa agcaatccaa
122461 tatattaaga ttggagcata tgaatatgaa atttacagcg atagctaaag cgatatttgt
122521 attaggaata ttaacaacaa gtgtaatgat aacagaaaat caatcggtta atgcaaaagg 

122581 aaagtatgaa aaaatgaacc gtttatatga tacaaacaag ttacatcaat actattcagg

122641 acctagttat gagttaacaa atgttagtgg ccaaagtcaa ggttattatg actctaacgt 
122701 tttgcttttt aaccaacaaa atcaaaagtt ccaagtattt ttattgggaa aagatgaaaa
122761 taaatacaaa gaaaaaacac atggtttaga tgtctttgcg gtaccagaat tagtagattt
122821 agatggaaga atatttagtg ttagtggtgt aacgaagaaa aacgtaaaat caatatttga 

122881 gtctctaaga acgccgaact tactagttaa aaaaatagac gataaagacg gtttttctat

122941 tgatgaattt ttctttattc aaaaggaaga agtgtcattg aaggaacttg attttaaaat 
123001 aagaaaactg ttgattaaaa aatacaaact gtatgaaggg tcagctgata aaggtagaat
123061 tgttattaat atgaaagatg aaaataagta tgaaattgat ttaagtgata aattagattt 
123121 cgagcgtatg gcagatgtca ttaatagtga acaaattaag aacatcgaag tgaatttgaa 

123181 ataatcaatg atatatagaa taaaagctta agaagcggtt taataatccc atgtttaatg

123241 attttgatac gtgttttaat aataaaaaca tatcgaacat tgactacgtt attaagctgc 
123301 ttttttgtac actttgtatc gaataactta agatctaaaa ctaatcggaa agaacaatga 
123361 ttcccccaaa aaaatttatg ttgctattaa aaatcagtta atacgaatgt taaaatacgt
123421 ttgattttca ttaataatga ttcaagttta tttaaatgag cgttaatgtc agtctgtttt 

123481 gatgcacctt ataataaaga cagatagttc aaattacgta atcataacaa tccaatatat

123541 caagattgga gcaaataaat atgaaattga ctgcattagc aaaagcaaca ttagcattag 

123601 gaatattaac tacaggtgtg tttacagcag aaagtaaagc tgttcacgcg aaagtagaac
123661 ttgatgagac acaacgcaaa tattatatca atatgctaca tcaatactat tctgaagaaa 
123721 gttttgaatc aacaaacatt agcgttaaaa gtgaagatta ctatggttct aacgttttaa 

123781 actttaacca acgaaataaa actttcaaag tgtttctgct tggtgacgat aaaaataaat

123841 ataaagaaaa aacacatggc cttgatgtct ttgcagtacc tgaattaata gatataaaag 
123901 gtggcatata tagcgttggc ggtataacaa agaaaaatgt gagatcagtg tttggatttg 
123961 taagtaatcc aagtctacaa gttaaaaaag ttgatgctaa acatggcttt tcgataaatg 
124021 agttgttctt tattcaaaag gaagaagtat cgttgaagga actggatttt aaaataagaa
124081 aaatgttagt cgaaaaatat agattgtata aaggcgcgtc agataaaggt agaatcgtta

124141 ttaatatgaa agacgaaaag aaatatgtaa ttgatttaag tgaaaaatta agttttgatc 
124201 gtatgtttga tgtaatggat agtaagcaaa ttaaaaatat tgaagtgaat ttgaattagt
124261 ttgagttaat agcataatag cttaagaagc gacttaacga caaaatgtga attgacatct
124321 gtgtccttat ataaggaact gtgttaaata cattactgtt gttaagttgt ttttgtaatt
124381 caaagagcag aacagagtaa catcatcagt tgtagtaaac gataatccag taaaacaact

124441 aaatgaaata atgaaagtca tttaacctga acattaaaat atatttgttt ttcattaaga 
124501 ataattcaag tatatttaaa tcgaggttaa ttatcgtatg aaacgatgca cgttataata 
124561 aaaatgtatg attcaaatta cgtaatgaaa acaatccaat atattaagat tggagcaaat 

124621 aaatatgaaa tttacagcat tagcaaaagc aacattagct ttaggaattt taacaacagg
124681 aactttaaca acagaagttc attcaggtca tgcaaaacaa aatcaaaagt cagtaaataa
124741 acatgacaag gaggcactat accgatacta cactggaaag actatggaaa tgaaaaatat 
124801 tagtgctttg aaacatggta aaaataactt gcgttttaag tttagaggta ttaagattca
124861 agttttactg cctggaaatg ataaaagtaa atttcaacag cgtagttatg aggggttaga
124921 tgtgtttttt gttcaagaaa aaagagataa gcacgatata ttttatactg ttggtggtgt
124981 aatacagaat aataaaacat ctggagttgt cagtgcacca atattaaata tttcaaaaga
125041 aaagggtgaa gatgcttttg tgaaaggtta cccttattac attaaaaaag aaaaaataac 
125101 actaaaagag ctggattata agttgagaaa gcatctaatt gaaaaatacg gactttataa 
125161 aacaatctca aaagatggta gggtcaaaat tagcttgaaa gatggcagtt tttataacct 
125221 tgatttaaga tctaaattaa aatttaaata tatgggggaa gtcatagaaa gcaaacaaat 
125281 taaagatatt gaagttaact taaagtaaat aattacgaat aataaaaagt aattgaagcg 
125341 gcttaacggt gaaatgtaaa ttggtgcgca tagcttatac aaaaaggatg catcaatcga 
125401 tatcgtcgtt aagccgtttt tgtttgtgtg tcatgaatcc tatcccaatc tccatgaata 
125461 taaaaattcc accaccaaca tcaaaattct ccacatcgca acataaccaa atgttataat
125521 aaatctatta cacaaagaga taaattactt attcaaaggc ggaggaatca catgtctatt 
125581 actgaaaaac aacgtcagca acaagctgaa ttacataaaa aattatggtc gattgcgaat 
125641 gatttaagag ggaatatgga tgcgagtgaa ttccgtaatt acattttagg cttgattttc 
125701 tatcgcttct tatctgaaaa agcggaacaa gaatatgcag atgccttgtc aggtgaagat 
125761 attacgtatc aagaagcatg ggcagatgaa gaatatcgtg aagacttaaa agctgaatta
125821 attgatcaag tcggttactt cattgaacca caagatttat tcagtgcgat gattcgtgaa 
125881 attgaaacgc aagatttcga tatcgaacat ctcgcaacgg cgattcgtaa agttgaaaca
125941 tcaacactag gtgaagaaag tgaaaatgac tttatcggac tgttcagcga tatggactta
126001 agttcaacgc gactaggtaa caatgtcaaa gaacgtactg cactaatttc caaagttatg
126061 gttaaccttg acgacttacc attcgttcac agtgatatgg aaattgatat gttaggtgat
126121 gcatacgaat ttcttatcgg gcgctttgcg gcgacagcag gtaaaaaagc aggcgagttc
126181 tatacaccac aacaagtatc taagatactg gcgaagattg tcacagacgg taaagataaa
126241 ttacgtcacg tgtatgaccc aacatgtggt tcaggttcat tattgttacg tgttggtaaa
126301 gaaacgcaag tgtatcgtta tttcggtcaa gaacgtaaca ataccactta caacttagcg
126361 cgcatgaaca tgttattaca tgatgtgcgc tatgaaaact ttgatatccg taatgatgat
126421 acgttggaaa atccagcttt cttaggcaat acatttgatg cggttatagc gaacccacca
126481 tacagtgcga aatggactgc agattcaaaa tttgaaaatg atgaacgctt cagtggttac
126541 ggcaagcttg cgccaaaatc caaagcagac tttgccttta ttcaacatat ggtacattac
126601 ctagacgatg aaggtaccat ggcagttgta ctcccacatg gtgtcttatt ccgtggtgcc
126661 gcagaaggcg tcattcgtcg ttatttaatt gaagaaaaga actacttaga agccgtgatt
126721 ggcttaccag tgaatatttt ctatgggaca agtattccaa catgtatctt agtatttaaa
126781 aaatgttgcc aacaagacga caacgtatta tttatcgatg catccaatga ttttgaaaaa
126841 ggaaaaaatc aaaaccattt aagcgatgcc caagtcgaac gtattattga cacatacaag
126901 cgtaaagaaa caattgataa atacagttac agtgcgacat tacaagagat tgccgataac
126961 gattacaacc taaacatacc gaggtatgtc gatacattcg aagaagaagc gccaattgat
127021 ttagatcaag tccaacaaga tttgaaaaat atcgacaaag aaatcgcaga aattgaacaa
127081 gaaatcaatg catacctgaa agaacttggg gtgttgaaag atgagtaata cacaaaagaa
127141 aaatgtgcca gagttgagat tcccagggtt tgaaggcgaa tgggaagaga agcagttggg
127201 ggatcttaca gatagagtaa ttaggaaaaa taaaaactta gaatcgaaaa agcctttaac
127261 aatatccgga cagttaggtt taattgatca aacagaatat tttagtaaat cagtttcgtc
127321 gaaaaatcta gaaaattata cactaataaa gaatggagaa ttcgcgtata acaaaagtta
127381 ttctaatgga tacccattag gggctattaa aagattaact agatatgata gtggtgtatt
127441 gtcctctttg tatatttgtt tttctattaa aagtgaaatg tctaaagact tcatggaagc
127501 atattttgat tcgacacact ggtatagaga agtttctgga attgcagttg agggtgcaag
127561 aaatcacgga ttattaaatg tttctgtgaa tgattttttt actattctaa ttaaatatcc
127621 aagtttagaa gaacagcaaa aaataggcaa gttcttcagc aaactcgacc gacaaattga
127681 attagaagaa caaaagcttg aattacttca acaacagaaa aaaggctata tgcagaaaat
127741 tttctcacag gaactgcgat tcaaagatga gaatggtgaa gattatccag attgggaaaa
127801 tagcaaaata gaaaaatatt taaaagagag aaacgaacgt tctgacaaag ggcaaatgct
127861 ttcagtaact ataaatagtg gcattataaa atttagtgaa ttggatagaa aagataattc
127921 aagtaaagat aaaagtaatt ataaagtagt taggaaaaat gatattgcat ataattctat
127981 gagaatgtgg caaggggcta gtggtaaatc aaattataat gggattgtta gccctgcata
128041 tactgtgctt tatccaacac aaaatactag ctcattattt attggatata agtttaaaac
128101 acatagaatg attcataaat ttaaaattaa ttcacaagga ttaacatcag atacatggaa
128161 cttaaaatat aaacaattaa aaaatataaa tatagatata cctgtattgg aggaacaaga
128221 aaagataggt gatttcttta aaaaaatgga tatattgata agtaaacaga aaatgaaaat
128281 tgaaatatta gaaaaagaga aacaatcctt tttacaaaaa atgttcttat aactttgata
128341 aatacataga ttgcataaga ataaaatttg tataatttaa cataaaagtt gtaaaagtaa
128401 agtgaattaa aaacgaacat taaatttagg cactgtgaaa gcgcagtgtc ttttttgtgt
128461 cgaaattgtg tacagaataa gtagttaaat aaagattaag ttgagataaa gtgttattcg
128521 taaataaaag agagtagatc gataggaatt gaatgatatt agttaactat ttattaaatt
128581 acttaataat gattaatttt tagttaaagt aagtttaatg tgaagcacga ccattgctca
128641 ttataatgaa tgaggattgt tcgtattgcg taatagaata aatcaaatag actaaaaatt
128701 gggagcatag aattatgaaa ttaaaaaata ttgctaaagc aagtttagca ctagggattt
128761 taacaacagg tatgattaca actactgctc agccagtaaa agcaagtaca ttagaggtta
128821 gatcacaagc tactcaagac ttgagtgaat attataaagg gagaggattt gaattaacaa
128881 atgtaacagg atataaatac ggaaataaag ttacatttat agataattct caacaaattg
128941 atgttacatt gacaggaaat gaaaaattaa ctgttaaaga tgatgacgaa gtttctaatg
129001 ttgacgtgtt tgtagtaaga gaaggtagtg acaaatcagc tatcacaaca tcgattggtg
129061 gaattacaaa gacaaatggg actcaacata aagatactgt tcaaaacgtt aatttgtcag
129121 tttctaagag tacaggtcaa cacactactt ctgtgacttc agaatactat agcatctata
129181 aagaagaaat ctcattaaaa gaacttgatt ttaaattaag aaagcattta attgataaac
129241 atgatcttta taagacagaa cctaaagaca gtaaaattag aattactatg aaaaacggtg
129301 gctattatac atttgaatta aataaaaaat tacagcctca tcgcatgggt gatacgattg
129361 atagtagaaa tatagagaaa attgaagtga atttataata atattcgagg gagtatatca



SEQ ID No: 17 - SSL1

gene
/gene="(set6) Mu50ssl1"
CDS 116916..117596
/gene="(set6) Mu50ssl1"
/note="Pathogenicity island SaPIn2
SAV0422"
/codon_start=1
/transl_table=11
/product="(exotoxin 6) SSL1"
/protein_id="BAB56584.1"
/db_xref="GI:14246190"
/translation="MKFKAIAKASLALGMLATGVITSNVQSVQAKTEVKQQSESELKH
YYNKPVLERKNVTGYKYTEKGKDYIDVIVDNQYSQISLVGSDKDKFKDGDNSNIDVFI
LREGDSRQATNYSIGGVTKTNSQPFIDYIHTPILEIKKGKEEPQSSLYQIYKEDISLK
ELDYRLRERAIKQHGLYSNGLKQGQITITMKDGKSHTIDLSQKLEKERMGDSIDGRQI
QKILVEMK"

SEQ ID No: 18 - SSL2

gene 117882..118577
/gene="(set7) Mu50ssl2"
CDS 117882..118577
/gene="(set7) Mu50ssl2"
/note="Pathogenicity island SaPIn2
SAV0423"
/codon_start=1
/transl_table=11
/product="(exotoxin 7) SSL2"
/protein_id="BAB56585.1"
/db_xref="GI:14246191"
/translation="MKMKNIAKISLLLGILATGVNTTTEKPVHAEKKPIVISENSKKL
KAYYTQPSIEYKNVTGYISFIQPSIKFMNIIDGNSVNNLALIGKDKQHYHTGVHRNLN
IFYVNEDKRFEGAKYSIGGITSANDKAVDLIAEARVIKADHIGEYDYDFFPFKIVKEA
MSLKEIDFKLRKYLIDNYGLYGEMSTGKITVKKKYYGKYTFELDKKLQEDRMSDVINV
TDIDRIEIKVRKA"

SEQ ID No: 19 - SSL3 

gene 118868..119938
/gene="(set8) Mu50ssl3"
CDS 118868..119938
/gene="(set8) Mu50ssl3"
/note="Pathogenicity island SaPIn2
SAV0424"
/codon_start=1
/transl_table=11
/product="(exotoxin 8) SSL3"
/protein_id="BAB56586.1"
/db_xref="GI:14246192"
/translation="MNMKTIAKTSLALGLLTTGAITVTTQSVKAEKIQSTKVDKVPTL
KAERLAMINITAGANSATTQAANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEE
QKSLNISATPAPKQEQSQTTTESTTPKTKVTTPPSTNTPQPMQSTKSDTPQSPTIKQA
QTDMTPKYEDLRAYYTKPSFEFEKQFGFLLKPWTTVRFMNVIPNRFIYKIALVGKDEK
KYKDGPYDNIDVFIVLEDNKYQLKKYSVGGITKTNSKKVDHKAELSVTKKDNQGMISR
DVSEYMITKEEISLKELDFKLRKQLIEKHNLYGNMGSGTIVIKMKNGGKYTFELHKKL
QEHRMADVIEGTNIDKIEVNIK"

SEQ ID No: 20 - SSL5

gene 120302..121006
/gene="(set10) Mu50ssl5"
CDS 120302..121006
/gene="(set10) Mu50ssl5"
/note="Pathogenicity island SaPIn2
SAV0425"
/codon_start=1
/transl_table=11
/product="(exotoxin 10) SSL5"
/protein_id="BAB56587.1"
/db_xref="GI:14246193"
/translation="MKMAAIAKASLALGILATGTITSLHQTVNASEHEAKYENVTKDI
FDLRDYYSGASKELKNVTGYRYSKGGKHYLIFDKNRKFTRVQIFGKDIERFKARKNPG
LDIFVVKEAENRNGTVFSYGGVTKKNQDAYYDYINAPRFQIKRDEGDGIATYGRVHYI
YKEEISLKELDFKLRQYLIQNFDLYKKFPKDSKIKVIMKDGGYYTFELNKKLQTNRMS
DVIDGRNIEKIEANIR"

SEQ ID No: 21 - SSL7 

gene 121452..122147
/gene="(set11) Mu50ssl7"
CDS 121452..122147
/gene="(set11) Mu50ssl7"
/note="Pathogenicity island SaPIn2
SAV0426"
/codon_start=1
/transl_table=11
/product="(exotoxin 11) SSL7"
/protein_id="BAB56588.1"
/db_xref="GI:14246194"
/translation="MKLKTLAKATLALGLLTTGVITSEGQAVHAKEKQERVQHLYDIK
DLYRYYSSESFEFSNISGKVENYNGSNVVRFNQEKQNHQLFLLGKDKDKYKKGLEGQN
VFVVKELIDPNGRLSTVGGVTKKNNKSSETNTHLFVNKVYGGNLDASIDSFLINKEEV
SLKELDFKIRKQLVEKYGLYKGTTKYGKITINLKDEKKEVIDLGDKLQFERMGDVLNS
KDIQNIAVTINQI"



SEQ ID No: 22 - SSL8 

gene 122486..123184
/gene="(set12) Mu50ssl8"
CDS 122486..123184
/gene="(set12) Mu50ssl8"
/note="Pathogenicity island SaPIn2
SAV0427"
/codon_start=1
/transl_table=11
/product="(exotoxin 12) SSL8"
/protein_id="BAB56589.1"
/db_xref="GI:14246195"
/translation="MKFTAIAKAIFVLGILTTSVMITENQSVNAKGKYEKMNRLYDTN
KLHQYYSGPSYELTNVSGQSQGYYDSNVLLFNQQNQKFQVFLLGKDENKYKEKTHGLD
VFAVPELVDLDGRIFSVSGVTKKNVKSIFESLRTPNLLVKKIDDKDGFSIDEFFFIQK
EEVSLKELDFKIRKLLIKKYKLYEGSADKGRIVINMKDENKYEIDLSDKLDFERMADV
INSEQIKNIEVNLK"

SEQ ID No: 23 - SSL9 

gene 123561..124259
/gene="(set13) Mu50ssl9"
CDS 123561..124259
/gene="(set13) Mu50ssl9"
/note="Pathogenicity island SaPIn2
SAV0428"
/codon_start=1
/transl_table=11
/product="(exotoxin 13) SSL9"
/protein_id="BAB56590.1"
/db_xref="GI:14246196"
/translation="MKLTALAKATLALGILTTGVFTAESKAVHAKVELDETQRKYYIN
MLHQYYSEESFESTNISVKSEDYYGSNVLNFNQRNKTFKVFLLGDDKNKYKEKTHGLD
VFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKVDAKHGFSINELFFIQK
EEVSLKELDFKIRKMLVEKYRLYKGASDKGRIVINMKDEKKYVIDLSEKLSFDRMFDV
MDSKQIKNIEVNLN"

SEQ ID No: 24 - SSL10 



gene 124625..125308
/gene="(set14) Mu50ssl10"
CDS 124625..125308
/gene="(set14) Mu50ssl10"
/note="Pathogenicity island SaPIn2
SAV0429"
/codon_start=1
/transl_table=11
/product="(exotoxin 14) SSL10"
/protein_id="BAB56591.1"
/db_xref="GI:14246197"
/translation="MKFTALAKATLALGILTTGTLTTEVHSGHAKQNQKSVNKHDKEA
LYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKFQQRSYEGLDVFF
VQEKRDKHDIFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAFVKGYPYYIKKEKITL
KELDYKLRKHLIEKYGLYKTISKDGRVKISLKDGSFYNLDLRSKLKFKYMGEVIESKQ
IKDIEVNLK"



SEQ ID No: 25 - SSL11

gene 128715..129398
/gene="(set15) Mu50ssl11"
CDS 128715..129398
/gene="(set15) Mu50ssl11"
/note="Pathogenicity island SaPIn2
SAV0433"
/codon_start=1
/transl_table=11
/product="(exotoxin 15) SSL11"
/protein_id="BAB56595.1"
/db_xref="GI:14246201"
/translation="MKLKNIAKASLALGILTTGMITTTAQPVKASTLEVRSQATQDLS
EYYKGRGFELTNVTGYKYGNKVTFIDNSQQIDVTLTGNEKLTVKDDDEVSNVDVFVVR
EGSDKSAITTSIGGITKTNGTQHKDTVQNVNLSVSKSTGQHTTSVTSEYYSIYKEEIS
LKELDFKLRKHLIDKHDLYKTEPKDSKIRITMKNGGYYTFELNKKLQPHRMGDTIDSR
NIEKIEVNL"



S. aureus strain MW2 taken from GenBank

SEQ ID No: 26 - SSL1 

gene 139025..139709
/gene="(set16) MW2ssl1"
CDS 139029..139709
/gene="(set16) MW2ssl1"
/note="ORFID:MW0382
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94247.1"
/db_xref="GI:21203547"
/translation="MKFKAIAKASLALGMLATGVITSNVQSVQAKTEVKQQSEADLKL
YYNGPSFEYKKVTGYGFIEGKDRFIDFIYNGQYNKISLVGSDKDKYNEEVNPDIDVFV
VREGNGRQADNHSIGGITKTNRGVYYDYIHTPILEIKKGKEEPQSSLYQIYKEDISLK
ELDFKLRKQLISQSGLYSNGLKQGQITITMNDGTTHTIDLSQKLEKERMGESIDGRQI
QKILVEMK"

SEQ ID No: 27 - SSL2 

gene 139995..140690
/gene="(set17) MW2ssl2"
CDS 139995..140690
/gene="(set17) MW2ssl2"
/note="ORFID:MW0383
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94248.1"
/db_xref="GI:21203548"
/translation="MKMKSIVKISLLLGILATGVNTTTEKPVHAEKKPIVISENSKKL
KAYYTQPSIEYKNVTGYISFIQPSIKFMNIIDGNSVNNIALIGKDKQHYHTGVHRNLN
IFYVNEDKRFEGAKYSIGGITSANDKAVDLIAEARVIKADHIGEYDYDFFPFKIDKEA
MSLKEIDFKLRKYLIDNYGLYGEMSTGKITVKKKYYGKYTFELDKKLQEDRMSDVINV
TDIDRIEIKVRKA"



SEQ ID No: 28 - SSL3

gene 140981..142051
/gene="(set18) MW2ssl3"
CDS 140981..142051
/gene="(set18) MW2ssl3"
/note="ORFID:MW0384
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94249.1"
/db_xref="GI:21203549"
/translation="MKMRTIAKTSLALGLLTTGAITVTTQSVKAEKIQSTKVDKVPTL
KAERLAMINITAGANSATTQAANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEE
QKTLNISATPAPKQEQSQTTTESTTQQTKMTTPPSTNTPQPMQSTKSDTPQSPTIKQA
QTDMTPKYEDLRAYYTKPSFEFEKQFGFLLKPWTTVRFMNVIPNRFIYKIALVGKDEK
KYKDGPYDNIDVFIVLEDNKYQLKKYSVGGITKTNSKKVNHKVELSITKKDNQGMISR
DVSEYMITKEEISLKELDFKLRKQLIEKHNLYGNMGSGTIVIKMKNGGKYTFELHKKL
QEHRMADVIDGTNIDNIEVNIK"

SEQ ID No: 29 - SSL4 



gene 142416..143363
/gene="(set19) MW2ssl4"
CDS 142416..143363
/gene="(set19) MW2ssl4"
/note="ORFID:MW0385
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94250.1"
/db_xref="GI:21203550"
/translation="MKITTIAKTSLALGLLTTGVITTTTQAANATTPPSTKVETPQQV
ANATTPSSTKVEAPQQAANATTPSSTKVEAPQSKPNATTPSSTKVEAPQQAANATTPP
SSNVDTSPPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEIGIILKKWTTIRFMNV
VPDYFIYKIALVGKDDKKYGEGVHRNVDVFWLEENNYNLEKYSVGGITKSNSKKVDH
KAGVRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIEKNNLYGNVGSGKIV
IKMKNGGKYTFELHKKLQENRMADVIDGTNIDNIEVNIK"

SEQ ID No: 30 - SSL5

gene 143727..144431
/gene="(set20) MW2ssl5"
CDS 143727..144431
/gene="(set20) MW2ssl5"
/note="ORFID:MW0386
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94251.1"
/db_xref="GI:21203551"
/translation="MKMAAIAKASLALGILATGTITSLHQTVNASEHEAKYENVTKDI
FDLRDYYSGASKELKNVTGYRYSKGGKHYLIFDKHQKFTRIQIFGKDIERFKARKNPG
LDIFVVKEAENRNGTVFSYGGVTKKNQDAYYDYINAPRFQIKRDEGDGIATYGRVHYI
YKEEISLKELDFKLRQYLIQNFDLYKKFPKDSKIKVIMKDGGYYTFELNKKLQTNRMS
DVIDGRNIEKIEANIR"

SEQ ID No: 31 - SSL6

gene 144877..145575
/gene="(set21) MW2ssl6"
CDS 144877..145575
/gene="(set21) MW2ssl6"
/note="ORFID:MW0387
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94252.1"
/db_xref="GI:21203552"
/translation="MKLKALAKATLVLGLLATGVITTESQTVKAAESTQGQHNYKSLK
YYYSKPSIELINVDGLYRQHLTDKGAYVWKNLKDYYIGLLGEDSKKFKSDVYGDLDAF
LVIEEEPVKGRQYSIGGISKTNSKEFKEREVDVKVTRKADRDTTSTKDSKFKITKEEI
SLKELDFKLRQKLMKEENLYDAINHRKGKIVVKMEDDKFYTFELTKKLQPHRMGDTID
GTKIKEINVELEYK"



SEQ ID No: 32 - SSL7

gene 145997..146692
/gene="(set22) MW2ssl7"
CDS 145997..146692
/gene="(set22) MW2ssl7"
/note="ORFID:MW0388
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94253.1"
/db_xref="GI:21203553"
/translation="MKLKTLAKATLALGLLTTGVITSEGQAVQAKEKQERVQHLYDIK
DLHRYYSSESFEFSNISGKVENYNGSNVVRFNQENQNHQLFLSGKDKDKYKEGLEGQN
VFVVKELIDPNGRLSTVGGVTKKNNQSSETNTPLFIKKVYGGNLDASIESFLINKEEV
SLKELDFKIRQHLVEKYGLYKGTTKYGKITFNLKDGEKQEIDLGDKLQFEHMGDVLNS
KDIQNIAVTINQI"

SEQ ID No: 33 - SSL8 

gene 147031..147729
/gene="(set23) MW2ssl8"
CDS 147031..147729
/gene="(set23) MW2ssl8"
/note="ORFID:MW0389
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94254.1"
/db_xref="GI:21203554"
/translation="MKFTAIAKAIFVLGILTTSVMITENQSVNAKGKYEKMNRLYDTN
KLHQYYSGPSYELTNVSGQSQGYYDSNVLLFNQQNQKFQVFLLGKDENKYKEKTHGLD
VFAVPELVDLDGRIFSVSGVTKKNVKSIFESLRTPNLLVKKIDDKDGFSYDEFFFIQK
EEVSLKELDFKIRKLLIKKYKLYEGAADKGRIVINMKDENKYEIDLSDKLGFERMADV
INSEQIKNIEVNLK"

SEQ ID No: 34 - SSL9

gene 148108..148806
/gene="(set24) MW2ssl9"
CDS 148108..148806
/gene="(set24) MW2ssl9"
/note="ORFID:MW0390
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94255.1"
/db_xref="GI:21203555"
/translation="MKFTALAKATLALGILTTGVFTTESKAVHAKVELDETQRKYYIN
MLHQYYSEESFEPTNISVKSEDYYGSNVLNFKQRNKAFKVFLLGDDKNKYKEKTHGLD
VFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKIDPKHGFSINELFFIQK
EEVSLKELDFKIRKMLVEKYRLYKGASDKGRIVINMKDEKKYVIDLSEKLSFDRMFDV
MDSKQIKNIEVNLN"



SEQ ID No: 35 - SSL10 



gene 149165..149848
/gene="(set25) MW2ssl10"
CDS 149165..149848
/gene="(set25) MW2ssl10"
/note="ORFID:MW0391
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94256.1"
/db_xref="GI:21203556"
/translation="MKLTAIAKAALALGILTTGTLTTEVHSGHAKQNQKSVNKHDKEA
LYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKFQQRSYEGLDVFF
VQEKRDKHDIFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAFVKGYPYYIKKEKITL
KELDYKLRKHLIEKYGLYKTISKDGRVKISLKDGSFYNLDLRSKLKFKYMGEVIESKQ
IKDIEVNLK"



SEQ ID No: 36 - SSL11

gene 153324..154016
/gene="(set26) MW2ssl11"
CDS 153324..154016
/gene="(set26) MW2ssl11"
/note="ORFID:MW0394
exotoxin homolog [Genomic island nu Sa alpha2]"
/codon_start=1
/transl_table=11
/protein_id="BAB94259.1"
/db_xref="GI:21203559"
/translation="MKLKNIAKASLALGILTTGMITTTAQPVKAIEQSRLSVTSKDTQ
ELKKYYSGTGYNFQNVSGYREGNKMNIIDGPQLNWTLLGTDKERFKDDEDYEGLDVF
WREGSGKHADNISIGGITKTNPCNQYKDPVQNVNLLTSKSNGQNTASVTSEYYSINKE
EISLKELDFKLRKQLIDKHDLYKTEPKDSKIKVSMKNGGYYTFELNKKLQPHRMGDTI
DSRNIKKIEVNL"



S. aureus strain NCTC8325 taken from uncompleted genome 
project at Oklahoma University via http://pedant.gsf.de 



SEQ ID No: 37 




45955 CTGATT GTCTACATTT ATATTAGAGA TTAAAGCGTT TTGATTTTCA GATGCTTCTG CTTGTCCGTT TGTC 
46025 ATAATA TATAACATTG TAATTGCAAC AGATACTAAA CCGACTTTCA TTTTACGTAA CTTAAAATTT TCCC
46095 TCATGA TATGCTCCCT CGAATAATTT TATAAATTCA CTTCAATTTT TTCTATATTT CTGCCATCAA TAAC
46165 ATCACC CATACGGTGT GTTTGTAACT TTTTATTCAA TTCAAATGTG TAGAACCCAC CATCTTTCAT AGTA 

46235 ATTCGA ATTTTACTGT CTTTAGGTTC TGTCTTATAA AGGTTATGTT TATCAATTAA ATGCTTTCTT AATT
46305 TAAAAT CAAGTTCTTT TAATGAAATT TCTTCTTTAT TAATTGTATA TGTAGATGAT GTTGACGTTG ATGT
46375 AACACT ATCGATGTTT TTAGTAATTA TTAAATTTAC ATCTTTTACT TTATCAATAT AATTTGAACC GTTT
46445 GTTTTA GTAATACCAC CAATTGAAGC TGTATTACCA GATCTATCAG AATTTTCTCT TACAACAAAT ATAT
46515 CTACAT TAGAAATATC TTCACCAAAA TTTTGCTTTT CATTCCCAGT TAAAGTTACA TCTATAAGTT GATA
46585 ATTAGG AGTAAACGTC ACTTTTCCTT CCTCTTTATA TCCTGACTGA TTTGTATACT CAAAGAACGG TCTA
46655 TTATAA TATTCACTCA AGTCTTGAGT AGCTTGTGAT CTAACCTCTA ATGTACTXGC TTTTACTGGC TGAG
46725 CAGTAG TTGTAATCAT CCCTGTTGTT AAAATCCCTA GTGCTAAACT TGCTTTAGCA ATATTTTTTA ATTT
46795 CATAAT TCTATGCTCC CAATTTTTAG TCTATTTGAT TTATTCTATT ACGCAATACG AACAATCCTC ATTC
46865 ATTATA ATGAGCAATG GTCGTGCTTC ACATTAAACT TACTTTAACT AAAAATTAAT CATTATTAAG TAAT
46935 TTAATA AATAGTTAAC TAATATCATT CAATTCCTAT CGATCTACTC TCTTTTATTT ACGAATAACA CTTT

47005 ATCTCA ACTTAATCTT TATTTAACTA CTTATTCTGT ACACAATTTC GACACAAAAA AGACACTGCG CTTT 
47075 CACAGT GCCTAAATTT AATGTTCGTT TTTAATTCAC TTTACTTTTA CAACTTTTAT GTTAAATTAT ACAA 
47145 ATTTTA TTCTTATGCA ATCTATGTAT TTATCAAAGT TATAAGAACA TTTTTTGTAA AAAGGATTGT TTCT 
47215 CTTTTT CTAATATTTC AATTTTCATT TTCTGTTTAC TTATCAATAT ATCCATTTTT TTAAAGAAAT CACC 

30 47285 TATCTT TTCTTGTTCC TCCAATACAG GTATATCTAT ATT TAT ATTT TTTAATTGTT TATATTTTAA GTTC 

47355 CATGTA TCTGATGTTA ATCCTTGTGA ATTAATTTTA AATTTATGAA TCATTCTATG TGTTTTAAAC TTAT 
47425 ATCCAA TAAATAATGA GCTAGTATTT TGTGTTGGAT AAAGCACAGT ATATGCAGGG CTAACAATCC CATT 
47495 AT AATT TGATTTACCA CTAGCCCCTT GCCACATTCT CATAGAATTA TATGCAATAT CATTTTTCCT AACT 
475 65 ACTTTA TAATTACTTT TATCTTTACT TGAATTATCT TTTCTATCCA ATTCACTAAA TTTTATAATG CCAC 

35 47635 T ATTT A TAGTTACTGA AAGCATTTGC CCTTTGTCAG AACGTTCGTT TCTCTCTTTT AAATATTTTT CTAT 

477 05 TTTGCT ATTTTCCCAA TCTGGATAAT CTTCACCATT CTCATCTTTG AATCGCAGTT CCTGTGAGAA AATT 
47775 TTCTGC ATATAGCCTT TTTTCTGTTG TTGAAGTAAT TCAAGCTTTT GTTCTTCTAA TTCAATTTGT CGGT 
47845 CGAGTT TGCTGAAGAA CTTGCCTATT TTTTGCTGTT CTTCTAAACT TGGATATTTA ATTAGAATAG TAAA 
47915 AAAATC ATTCACAGAA ACATTTAATA ATCCGTGATT TCTTGCACCC TCAACTGCAA TTCCAGAAAC TTCT 

40 47985 CTAT AC CAGTGTGTCG AATCAAAATA TGCTTCCATG AAGTCTTTAG ACATTTCACT TT TAAT AG AA AAAC 

48055 AAATAT ACAAAGAGGA CAATACACCA CTATCATATC TAGTTAATCT TTTAATAGCC CCTAATGGGT ATCC 
4 8125 ATTAGA ATAACTTTTG TTATACGCGA ATTCTCCATT CTTTATTAGT GTATAATTTT CTAGATTTTT CGAC 
48195 GAAACT GAT T T AC T AA AATATTCTGT TTGATCAATT AAACCT AAC T GTCCGGATAT TGTTAAAGGC TTTT 
48265 TCGATT CTAAGTTTTT ATTTTTCCTA ATTACTCTAT CTGTAAGATC CCCTAACTGC TTCTCTTCCC ATTC 

45 48335 GCCTTC AAACCCTGGG AACCTCAATT CTGGCACATT TTTCTTTTGT GTATTACTCA TCTTTCAACA CCCC 

48405 AAGTTC TTTCAGGTAT GCATTGATTT CTTGCTCAAT TTCTGCGATT TCTTTATCGA TATTTTTCAA ATCT 
48475 TGTTGG ACTTGATCTA AATCAATCGG TGCTTCTTCT TCGAATGTAT CGACATATCT CGGTATATTT AGGT 
48545 TGTAAT CGTTATCGGC AATCTCTTGT AGTGTCGCGC TGTAGCTATA TTTATCAATT GTTTCCTTAC GCTT 
48615 ATATGT GTCTATAATA CGTTCGACTT GGGCATCGCT TAAATGATTT TGATTTTTTC CTTTTTCAAA ATCA 

50 48685 TTGGAT GCATCGATAA ATAGTACGTT GTCGTCTTGT TGGCGACATT TTTTAAATAC TAAAATACAT GTTG 

48755 GAATAC TTGTCCCATA GAAAATATTC GCTGGCAAAC CAATCACAGC TTCTAAGTAG TTCTTTTCTT CAAT 
48825 TAAATA ACGACGAATG ACACCTTCTG CAGCACCTCG GAATAATACA CCATGTGGGA GTACAACGGC CATG 
48895 GTACCT TCATCGTCTA GGTAATGTAC CATG T GTTG A ATAAAGGCAA AGTCTGCTTT AGACTTAGGC GCAA 
48965 GTTTGC CGTAACCACT GAATCGTTCG TCATTTTCAA ACTTTGAATC TGCAGTCCAT TTCGCACTAT ACGG 

55 49035 TGGGTT CGCAATAACC GCATCAAATG TATTGCCTAA AAAGGCTGGG TTTTCCAATG TGTCATCATT ACGG 

49105 ATATCG AAGTTCTCAT AACGCACATC ATGTAATAAC ATATTCATGC GTGCTAAGTT GT ATGT AGTA TTGT 
4 9175 TACGTT CTTGACCGAA ATAACGATAC ACTTGTGTTT CTTTACCAAC AC GT AAC AAC AGTGAACCTG AACC 
4 9245 ACATGT TGGGTCATAC ACGTGACGTA ATT TAT CTTT ACCGTCTGTG ACAATCTTCG CCAGTATCTT AGAT 
49315 ACTTGT TGTGGTGTAT AGAACTCGCC TGCTTTTTTA CCCGCTGTCG CCGCAAAGCG CCCAATTAGG AATT 

60 49385 CATATG CAT C ACCT AA CATATCAATT TCCATGTCAC TGTGAACGAA TGGTAAGTCG TCAAGATTAA CCAT 

49455 GACTTT AGAGATTAAA GCAGTACGTT CTTTGACATT GTTACCTAGT CGC GTTG AAC TCAAATCCAT ATCG 
49525 CTGAAC AGACCGATAA AGTCATTTTC ACTTTCTTCA CCTAATGTTG ATGTTTCAAC TTTACGAATT GCCG 
49595 TCGCCA GGTGTTCGAT ATCGAAATCT TGCGTTTCAA TTTCACGAAT CATCGCACTG AATAAATCTT CTGG 
49665 CTCAAT GAAGTAACCG ACTTGGTCAA TTAATTCTGC TTTTAAGTCT TCACGGTATT CTTCGTCTGC CCAT 

65 4 9735 GCTTCT TGATACGTGA TGTCTTCACC TGACAAGGCA TCTGCATATT CTTGTTCCGC TTTTTCAGAT AAGA 

49805 AGCGAT AGAAAATCAA GCCTAAAATG TAATTACGGA ATTCACTCGC AT CCAT ATTC CCTCTTAAAT CATT 
49875 CGCAAT CGAC CATAAT TTTTTATGTA ATTCAGCTTG TTGCTGACGT TGTTTTTCAG TAATAGACAT GTGA 
49945 TTCCTC CGCCTTTGAA TAAGTAATTT ATCTCTTTGT G TAAT AG ATT TATTATAACA TTTGGTTATG TTGC 
50015 GATGTG GAGAATTTTG ATGTTGGTGG TGGAAATTTT ACCTTTATGG AGATTGGGAT AGGATTCATG ACAC 

50085 ACAAAC CAAAACGGCT TAACGACGAT ATCGATTGAT GCATCCTTTT TGTATAAGCT ATGCGCACCA ATTT
50155 ACATTT CACCGTTAAG CCGCTTCAAT TACTTTTAAT TATTCGTAAT GATTTACTTT AAGTTAACTT CAAT
50225 ATCTTT AATTTGTTTG CTTTCTATGA CTTCCCCCAT ATATTTAAAT TTTAATTTAG ATCTTAAATC AAGG
50295 TTATAA AAACTGCCAT CTTTCAAGCT AATTTTGACC CTACCATCTT TTGAGATTGT TTTATAAAGT CCGT
50365 ATTTTT CAATTAGATG CTTTCTCAAC TTATAATCCA GTTCTTTTAG TGTTATTTTT TCTTTTTTAA TGTA
50435 ATAAGG GTAACCTTTC ACAAAAGCAT CTTCACCCTT TTCTTTTGAA ATATTTAATA TTGGTGCACT GACA
50505 ACTCCA GATGTTTTAT TATTCTGTAT TACACCACCA ACAGTATAAA ATATATCGTG CTTATCTCTT TTTT
50575 CTTGAA CAAAGAAAAC ATCTAACCCC TCATAACTAC GCTGTTGAAA TTTACTTTTA TCATTTCCAG GCAG
50645 TAAAAC TTGAATCTTA ATACCTCTAA ACTTAAAACG TAAGTTGTTT TTACCATGTT TCAAAGCACT AATA
50715 TTTTTC ATTTCCATAG TCTTTCCAGT GTAGTATCGG TATAATGCTT CCTTGTCATG TTTATTTACT GACT
50785 TTTGAT TTTGTTTTGC ATGACCTGAA TGAACTTCTG TTGTTAAAGT TCCTGTTGTT AAAATTCCTA AAGC
50855 TAATGT CGCTTTTGCT AATGCTGTAA ATTTCATATT TATTTGCTCC AATCTTAATA TATTGGATTG TTTT
50925 CATTAC GTAATTTGAA TCATACATTT TTATTATAAC GTGCATCGTT TCATACGATA ATTAACCTCG ATTT
50995 AAATAT ACTTGAATTA TTCTTAATGA AAAACAAATA TATTTTAATG TTCAGGTTAA ATGACTTTCA TTAT
51065 TTCATT TAGTTGTTTT ACCGGATTAT CGTTTACTAC AACTGATGAT GTTACTCTGT TCTGCTCTTT GAAT
51135 TACAAA AACAACTTAA CAACAGTAAT GTATTTAACA CAGTTCCTTA TATAAGGACA CAAATGCCAA TTCA
51205 CATTTT GTCGTTAAGC CGCTTCTTAA GCTATTATGC TATTAACTCG AACTAATTCA AATTCACTTC AATA
51275 TTTTTA ATTTGCTTAC TATCCATTAC ATCAAACATA CGTTCAAAAC TTAATTTTTC ACTTAAATCA ATTT
51345 CATGCT TCTTTTCGTC TTTCATATTG ATAACAATTC TACCTTTATC AGACGTTCCT TTATACAATC TATA
51415 TTTTTC GATTAAGAGT TTTCTTATTT TAAAGTCCAG TTCCTTCAAT GATACTTCTT CCTTTTGAAT AAAA
51485 AACAAC TCGTTTATCG AAAAGCCATT TTTAGCATCA ACTTTTTTAA CTTGTAGACT TGGATTACTT ACAA
51555 ATCCAA ACACTGATCT CACATTTTTC TTTGTTATAC CGCCAACGCT ATATATGCCA CCTTTTATAT CTAT
51625 TAATTC AGGTACTGCA AAGACATCAA GGCCATGTGT TTTTTCTTTA TATTTATTTT TATCGTCACC AAGT
51695 AAAAAT ACTTTAAAAG CTTTATTTCG TTGTTTAAAG TTTAAAACGT TAGAGCCATA GTAATCTTCA CTTT
51765 TAACAC TAATGTTTGT TGGTTCAAAA CTTTCTTCAG AATAGTATTG ATGTAGCATA TTGATATAAT ATTT
51835 GCGTTG TGTCTCATCA AGTTCTACTT TCGCGTGACC AGTTTGACTT TCTGCTGTAA ACACACCTGT AGTT
51905 AATATT CCTAATGCTA ATGTTGCTTT AGCTATCGTT GTTAATTTCA TATTTTTTTG CTCCAATCTT AATG
51975 TATTGG ATTGTTATTA TTACGTAATT TGAACTATCT GTCTTTATTA TAAGGTGCAT CAAAACAGAC TGAC
52045 ATTAAC GCTCATTTAA ATAAACTTGA ATCATTATTA ATGAAAATCA AACGTATTTT AACATTCGTA TTAA
52115 CTGATT TTTAATAGCA ACATAAATAT TTTTTTGGTG AATCATTGTT CTTTCCGATT AGTTTTAAAT CTTA
52185 AGCTAT TGGTTATAAA GTGTACAAAA AAGCAGCTTA ATAACGTAGT CAATGTTCGA TATGTTTTTA TTAT
52255 AAAAAC ACGTATCAAA ATCATTAAAC ATGGGATTAT TAAACCGCTT CTTAAGCTTT CATTCTATAT ATAT
52325 CATTGA TTATTTCAAA TTCACTTCGA TGTTTTTAAT TTGTTCACTA TTAATGACAT CTGCCATACG CTCG
52395 AAATCT AATTTATCAC TTAAATCAAT TTCATACTTA TTTTCATCTT TCATATTAAT AACAATTCTA CCTT
52465 TATCAG CTGACCCTTC ATACAGTTTG TATTTTTTAA TCAACAGTTT TCTTATTTTA AAATCAAGTT CCTT
52535 CAATGA CACTTCTTCC TTTTGAATAA AGAAAAATTC ATCAATAGAA AAACCGTCTT TATCGTCTAT TTTT
52605 TTAACT AGTAAGTTCG GCGTTCTTAG AGACTCAAAT ATTGATTTTA CGTTTTTCTT TGTTACACCA CTAA
52675 CACTAA ATATTCTTCC ATCTAAATCT ACTAATTCTG GTACCGCAAA GACATCTAAA CCATGTGTTT TTTC
52745 TTTGTA TTTATTTTCA TCTTTTCCCA ATAAAAACAC TTGGAACTTT TGATTTTGTT GGTTAAAAAG CAAA
52815 ACGTTA GAGTCATAAT AACCTTGACT TTGGCCACTA ACATTTGTTA ACTCATAACT AGGTCCTGAA TAGT
52885 ATTGAT GTAACTTGTT TGTATCATAT AAACGGTTCA TTTTTTCATA CTTTCCTTTT GCATTAACCG ATTG
52955 ATTTTC TGTTATCATT ACACTTGTTG TTAATATTCC TAATATAAAT ATCGCTTTAG CTATCACTGT AAAT
53025 TTCATA TTCATGTGCT CCAATCTTAA TATATTGGAT TGCTTTTATT ACGTAATTTG AATAATACAC CTTT
53095 ATTATA AAGGGCATTG AGACAGACTG ACATTAACAC TCGTTTAATT AAACTTAAAT CATTGTTAAT GAAG
53165 AGCTAA TTAATATTCA ATGGGCATCA CTTTCAAGAG CAATGTAAAT TTTTAGTATG CTCACTCATA ACCT
53235 GCGTTA TAAAACGTAA AAAAAAGCCT AACGACAAGG TTGTTTTAAA TACGCTCCTT AATGTAAGGT GTAC
53305 GTATCT ATTCAACATT TCATCGTTAA GCTGCTTCAA ATTTATTACT TTAGAGTCAT TGATTACTTA CTTT
53375 AAATTT GTTTCAAAGT CACTTCAATC TTATTAATAT CCTTACTATT CAACACATCA CCCATGCGCT CGAA
53445 TTGCAA TTTATCACCT AAATCAATTT CTTGCTTTTC TCCATCTTTC AAATTGATAG TGATCTTACC GTAT
53515 TTAGTC GTACCTTTAT ATAAACCATA ATTTTTAACT AAATGTTGTC TAATTTTGAA ATCAAGTTCT TTCA
53585 GTGAAA CTTCTTCTTT ATTAATTGAA AATGAGTCAA TTGATGCATC TAAATTTCCG CCATACACTT TATT
53655 AACAAA TAAATGTGTA TTAGTTTCAG AAGATTTGTT ATTTTTCTTA GTCACACCAC CAACAGTAGA TAAT
53725 CTACCG TTTGGATCAA TTAATTCTTT TACCACAAAG ACATCTTTGC CTTCAATGCC TTCTTTATAT TTCT
53795 CTTTAT CTTTACCTAA TAAGAATAAT TGGTGATTTT GATTTTCTTG GTTAAAGCGT ACAACGTTAG AACC
53865 GTTATA ATTTTCAACC TTACCACTAA TATTACTGAA TTCAAAACTT TCTGATGAGT AGTATCGATG TAAG
53935 TCTTTA ATATCATATA AATGTTGTAC TCTCTCTTGC TTTTCTTTTG CTTGAACTGC TTGGCCTTCT GATG
54005 TAATCA CACCAGTAGT TAATAAGCCT AATGCCAATG TTGCTTTAGC TAACGTTTTT AATTTCACAG TACT
54075 ATTCTC CCAATCTATT TATAAATTTG TCTTAATATA TTTTTATATG ATTAATTCAA TTACGTAATA CGAA
54145 CAATCT ACTATCATTA TATAAGTGAG ATACTGCGCT ATGAATTAAC CTCTTTTTAA CTATTTTTGA ACCA
54215 ACGTTA AGTTAGTTAA TGAAATCCTA ACAGTATTAA TTATTTTTTA TAACACAATA CTAATTTTTT AACT
54285 ATTCAC ACTTTCCCTC AATTTAATTA TGTTTTCCAC ATTGTTTCAT GTCACGAAAA GGACAACGCG CGAC
54355 TATAAG TATCAACCGT TTTTCCAAAA ACTTGATCAT AAACCCGCTC CTTTTTTCAT CATAACAAAA TAAA
54425 AAAGAA GCTAAGCAAT ATGTTATCGC TTAACTTCCC TACAATTACT AGTCTGCTTG TCCAAAGATT ATTT
54495 ATATTC TAGCTCAACA TTAATTTCTT TGATTTTGGT ACCATCTATC GTGTCACCCA TGCGATGCGG TTGT
54565 AGTTTT TTTGTAAGTT CGAAAGTATA AAACTTATCA TCTTCCATTT TAACTACAAT TTTACCTTTT CTAT
54635 TATTAA CAGCACCATA TAATTTTTCT TCTTCCATCA ATTTTTTTCT TAATTTAAAG TCCAACTCTT TTAA
54705 CGAGAT TTCTTCTTTA GTAATTTTAA ATTTACTATC TTTAGACTTT TCCGATGATT CATCAATTTT TCTT
54775 GTTACT TTAACATCGA CTTCTTTACT AAATTCTTTA CTATTTGTCT TACTTAAACC ACCAATTGAA TATT
54845 GTCTTC CATTAACAGT TTCCTCCTCG ATGACTAAAA ATGCATCTTG CTTATCATGC TCACCTTGAG GGTA
54915 TTTTTC AATATCTTTA CCAAGCAAGC CAACAAAATA ATCTTTTCGA TCCTTCCAAA CATATACTCC TTTA
54985 TCTGTC ACTTTCTGTC TATACAAACC ATCAAGATTT TTTAACTCTA TACTTGGCTT GCTATAGTAG TATT
55055 TTAACG ATTTATAATT GTGTTGACCT TGAGTTGATT CTGCCGCTTT TACTGTTTGA CTTTCTGTTG TTAT
55125 TACACC AGTAGCTAAC AATCCCAATA CTAATGTTGC TTTAGCTAAC GTTTTTAATT TCATAGTACT ATTC
55195 TCCCAA TCTATTTATA AATTTTGTCT TAATATATTT TTATATGATT ATTTCAATTA CGTAATACGA ACAA
55265 TCTACT ATCATTATAT AAGCGTAATA CTAAGCTATG AATTAACCTC TTTTTAACTA TTTTTGAACC AATG
55335 TTAAGC TAATTAATGG AATCCTAACA GCGTTAATCT ATTTTTTAAA CTTAACGCAG CTTTTTTAAC TATT
55405 CACACT TTCCCTCAAT TTAACTATGT TTTCCACATT GTTTCATGTC ACGAAAAGGA CAACGCGCGA CTAT
55475 AAGTAT CAACTATTTC CACAAGTTTT ATTGGTGTTT TTATTATTCA TCGATACGCT TCATTTTCAT CTCT
55545 CCAACA CAAAAAAGAA GCTAAGCAAC TTATGTTGCC TAACTCCTCT ATACTATCCA TATTTTACTA TTAT
55615 CCATAT TTCATTGAAT TATCTAATGT TGGCTTCTAT TTTTTCAATA TTTCTACCGT CAATGACGTC ACTC
55685 ATGCGA TTTGTTTGTA ATTTTTTATT AAGTTCAAAC GTATAATAGC CGCCATCTTT CATTATCACT TTTA
55755 TCTTAC TATCTTTAGG AAACTTTTTA TACAGATCAA AATTTTGAAT TAAATACTGT CTCAATTTAA AGTC
55825 GAGTTC TTTAAGTGAA ATCTCTTCTT TATAAATGTA GTGTACTCTA CCGTACGTAG CAATACCGTC ACCT
55895 TCATCT CTCTTGATTT GAAATCTTGG TGCGTTTATA TAATCATAAT AAGCGTCTTG ATTTTTCTTA GTGA
55965 CACCAC CATATGAAAA CACTGTGCCA TTACGGTTTT CCGCTTCTTT AACAACAAAT ATGTCTAATC CCGG
56035 ATTTTT ACGTGCTTTA AATCTTTCAA TATCTTTACC AAATATCTGT ACTCTTGTGA ATTTTCTATT TTTA
56105 TCAAAG ATAAGGTAAT GCTTGCCACC TTTGCTATAA CGATAACCAG TAACATTTTT AAGTTCCTTA CTTG
56175 CGCCAC TATAGTAATC TCTTAAGTCA AAGATATCTT TTGTCACATT TTCATATTTT GCTTTATGTT CACT
56245 CGCATT TACAGTTTGA TGCAATGACG TTATTGTTCC TGTTGCTAAA ATACCTAATG CTAAACTTGC TTTC
56315 GCAATT GCTGTCATTT TCATAGTTGT ATGCTCCATT CGTAATTATT AGATTTGTTC GCTTACGTCT ATTG
56385 AATCAT ACAGCTTTAT TATAGTTAGC GTATTTGACC TTTCACATTA AACCATGTTT AATAATCATT GAAT
56455 CATTAT TAAGTAAATT AAGGAATCTA TAATGTTCGT TAAATAAAAC TGATCCCGTT GTGCTTCACA CCCG
56525 ATAGAT AGGGATTTAC AGATAAATTC AGGTCTCTTC CACGTCATAT TTGGACCCAT CGAAAATTCG GGTT
56595 CTCAAA TCATCGAACA TAACAAAAGA AGCTAAGCAA CATGTAGGCC GTTGTCACTT AACTTCTTGT TTTT
56665 CCGATG ACAGCTTCTA TTTAGAGAAT GTCATGATTA TTTTATATTC ACTTCAATGT TATCAATATT AGTG
56735 CCATCT ATGACATCTG CCATGCGATT TTCTTGTAAT TTTTTGTGCA ATTCAAACGT GTACTTTCCA CCGT
56805 TTTTCA TTTTAATAAC AATTTTACCT GAACCAACGT TACCGTACAG ATTATTTTTT TCAATAAGTT GTTT
56875 TCTCAA TTTAAAATCA AGTTCTTTCA AGGAAATCTG TTCTTTAGTA ATCTTGAATT CTGAAACATC ATGA
56945 GAGATT GTACCTTTAT TATCTTCCTT AGTAATTCTT ACTCCTGCTT TGTGATCAAC TTTTTTACTA TTAC
57015 TCTTTG TGATACCACC GACAGAATAT TTTTCCAGAT TGTAATTATT TTCTTCTAAA ACGACAAATA CATC
57085 GACATT CCTATGTACT CCTTCACCAT ATTTTTTATC ATCTTTACCA ACTAAAGCAA TTTTATATAT GAAA
57155 TAATCT GGGACAACAT TCATAAATCT TATTGTCGTC CATTTTTTTA AAATAATACC AATCTCATTT TTAA
57225 ATTCTA AACTTGGTTT CGTATAATAC GCTCTTAAAT CTTTAAATTT AGGATTTATT TCTGTTGGTA CTTG
57295 TTTTGT GGTTGGCGAT TGTGGTGTGT CTGATTTAGT AGATTGCATT GGTTGTGGCG TGTTTGTTGA TGGA
57365 GGTGTT GTCACTTTAG TTGAAGGCGG TGTTGTCGCA TTTGCTGTTT GTTGCGGTGC TTCTACTTTA GTTG
57435 AGGGCG GTGTTGTCGC GTTTGGTTTT GATTGCGGTG CTTCTATTTT AGTTGAGGGC GGTGTTGATT GTGG
57505 TGCTTC CACTTTAGTG GAAGATAGTG TTGTCGCGTT TGCTGCTTGC GTTGTCGTTG TGATTACACC TGTT
57575 GTTAAA AGGCCTAGTG CTAAACTTGT TTTAGCAATC GTTGTTATTT TCATAGTTGT ATGCTCCATT CGTA
57645 ATTATT AGATTTGTTC GATTACATTC ATTGAATCAT ACAGCTTTAT TATAGATGGC GTATTGCTCC ATTC
57715 ACATTA AACCTTGTTT AACTATATTT GAATCATCGT TAAGTAAATT AAGAAATCCA TAATGTTCGT TAAA
57785 TAAAAA TGATTTTGAT GTGATTCAAC ACTTGGCACA TTTGAAGTTT CGTCACTTTT AAGACATAGA AATG
57855 CCACTT TTACAAACAA ATGAATATTC GTCTTTTTAC ATCATTACGC ATAATAAAAG AAGCTAAGCA ACAT
57925 GTAAAC CGTTGTCACT TAACTTCTTG TTTTTCCGAT GACAGCTTCT ATTTAGAGAA TGTCATGATT ATTT
57995 TATATT CACTTCAATG TTATCAATAT TAGTGCCATC TATGACGTCT GCCATACGAT GCTCTTGCAG TTTT
58065 TTGTGT AATTCAAACG TATATTTCCC ACCGTTTTTC ATTTTAATAA CGATTGTTCC TGAACCCATG TTAC
58135 CGTAAA GATTATGTTT TTCAATAAGT TGTTTTCTCA ATTTAAAATC AAGCTCTTTC AAGGAAATCT CTTC
58205 CTTAGT AATCATGTAT TCTGAAACAT CGCGTGAAAT CATACCTTGA TTATCTTTTT TAGTAATGCT TAAT
58275 TCTACT TTGTGATTAA CTTTTTTACT ATTAGTCTTC GTGATGCCAC CGACAGAATA TTTTTTCAAT TGAT
58345 ATTTAT TGTCTTCTAA AACGATAAAT ACATCGATAT TATCGTAAGG TCCATCTTTA TATTTTTTCT CATC
58415 TTTTCC AACTAAAGCT ATTTTATAGA TGAACCTATT TGGAATAACA TTCATAAACC TAACCGTCGT CCAT
58485 GGTTTG AGCATAAATC CAAACTGCTT TTCAAATTCA AAACTCGGTT TTGTATAATA CGCTCTTAAA TCTT
58555 CATATT TAGGAGTCAT ATCTGTTTGT GCTTGTTTTA TGGTTGGAGA TTGTGGTGTG TCTGATTTAG TAGA
58625 TTGCAT TGGTTGTGGC GTGTTTGTTG ATGGAGGTGT TGTCACTTTA GTTTTCGGCG TTGTGGATTC GGTT
58695 GTCGTT TGTGATTGTT CTTGTTTAGG CGCTGGCGTT GCTGATATAT TAAGCGTTTT CTGCTCTTCT TGTT
58765 TAGGTT GTGATATTTT TTCTATTTTG GAAGCTGAGG TTTTTTCCTC ATTAGTATTT GGTGCCTTTT CGAG
58835 TTTAGG CGTGCGTTCT TGTCTTGTGT TAGCTGCTTG TGTTGTCGCT GAATTTGCAC CTGCTGTTAT GTTT
58905 ATCATT GCTAATCGCT CTGCTTTAAG CGTTGGTACT TTGTCAACTT TAGTTGATTG TATTTTTTCT GCTT
58975 TGACCG ATTGCGTCGT TACTGTAATT GCGCCTGTTG TTAAAAGCCC TAGTGCTAAA CTGGTTTTAG CAAT
59045 TGTTCT CATTTTCATA ATTGTATGCT CCAATCTATA TTATATTCGA TTGTCTTTTT ACGTAATTTG AATC
59115 ATACAA CATCATTATA GATGGCGTTC TAAGATAATC ACATTAAACC CCTTTTAACA ATTATTGAAG TATT
59185 ATTAAG TAATTTAAGC AAAAAATAAT GAGTGAGTAT GAGATTAATA TAGCGTTTCT ATGTGCCTTT GAAA
59255 TAATTT TTAAGCATTA AAAAGAAGTT AAGCAACGTT TGATCGTCAC TTAACCTCTC TATTTCAATT TCAA
59325 CTTATT TCGTCATCAA GTATATGTGT TATGCTTTTA TAACTTTGAT TTCAATTCTA TCAATATCTG TGAC
59395 ATTGAT AACATCGGAC ATACGGTCTT CTTGTAACTT TTTATCCAAT TCAAATGTAT ACTTTCCATA GTAT
59465 TTCTTT TTGACTGTAA TTTTTCCTGT ACTCATTTCA CCGTAAAGAC CATAATTATC AATAAGGTAT TTTC
59535 TTAATT TAAAATCAAT CTCTTTCAAT GACATCGCTT CTTTATCTAT TTTAAATGGG AAAAAGTCAT AATC
59605 ATATTC ACCAGTATGA TCTTCTTTAA TAACTCTTGC TTCTGCTATT AGGTCGACAG CTTTATCGTT TGCA
59675 CTCGTG ATACCCCCAA TAGAGTACTT TGCACCTTCA AATCTCTTAT CCTCATTAAC GTAAAATATA TTAA
59745 GATTAC GATGTACACC CGTATGATAA TGTTGCTTAT CTTTGCCAAT TAAAGCAATA TTATTAACAG AATT
59815 ACCATC TATGATATTC ATAAATTTAA TACTTGGTTG AATGAAACTG ATATAACCTG TCACATTTTT ATAT
59885 TCAATA CTAGGTTGAT TATAATAAGC TTTTAATTTT TTGCTATTTT CACTTATTAC AATAGGTTTC TTTT
59955 CGGCAT GAACTGGTTT TTCCGTTGTA GTGTTTACAC CTGTTGCTAA TATTCCTAAT AACAAACTTA TTTT
60025 TGCAAT ATTTTTCATT TTCATAGTTG TATGCTCCAA TCTATTATAA TTAGATTGTT TTATTACGTA ATTT
60095 GAATCA TACACCCATA TTATAGGAGC TGTATTCGGA TATTCACATT AACCTGTTTT TAACTATTCA TAAA
60165 ATATGA TTAAGCTATT TAAGCAAAAG ATAATGCGTG GTGCGACTGT TTGTGTGAGT TTTGTAACTT TTCC
60235 CCTCAA AAAAAGAAGC TAAGCAACAG TACGTCGCTT AACTTCCATA TATTTCCATA ACAAGCACTA TTCA
60305 ACATAG CGCTTTGTTG TTAGAAAGTA TTATTTCATT TCTACTAGAA TTTTATTAAT CTTAGTGCCG TCGA
60375 TTGACT CACCCATACG TTCTTTTTCA AGTTTTTGAC TTAAATCGAT TGTATGTGTT GTGCCATCAT TCAT
60445 TGTAAT TGTAATTTGA CCTTGTTTAA GACCATTTGA ATACAAGCCG TGTTGTTTAA TCGCACGTTC TCTT
60515 AATCTA TAATCAAGTT CTTTTAATGA GATGTCTTCT TTTGAAATGT AGTAAAAATC TTTAAGTACA TCTT
60585 CATTAT CTTTCTTGAT TTCTAAAATT GGCGTATTGA TATAATCAAT ATACTGCACA CTATTTGATT TTGT
60655 AACGCC ACCAATTGAG TAATTTGTTG CTTGTCTACT GTCACCTTCT CTAAGGATAA ACACATCTAT ATTT
60725 GAGTTT TCTCCGTCTT TAAATTTATC TTTATCAGAT CCAAGTAAAG TGATTCGAGA ATGCTGTTGC CCTA
60795 CTGTGA CTTCTAAATA GTGTTTACCC TCATCAGTAT ATTTAAATCC AGTCACATTT TTACGCTCTA AAAT
60865 TGGTTT ATTATAATAG TGTTTTAACT CTGATTCACT TTGTTGTTTA ACTTCTGCTT TCGCTTGTAC TGAT
60935 TGTACA TTCGATGTAA TTACACCTGT TGCTAACATT CCCAATGCTA AACTTGCTTT TGCTATCGCT TTAA
61005 ATTTCA TAATTTTTAG CCCCAATCTA TTTAAATTTG ATTTTTCAAT TACGTAATAT GAACAATCTA CCTA
61075 CATTAT ATTGAGGCGC TATCCCCTTT CAAATTAAAC CTCATTTAAC TAAACTTGAA TCATTGTTAA GGTA
61145 ATTAAT AAAATGCTAA CTGTTATTAA TCAATTCTTA TCCGTCGAGA CTCTTTTTTA TACAAATCAC ACTT
61215 TATACC ATTTTTATTT TAAAAACTGT AAAACTTTCA GACACAAAAC AGACATCATA TTGTCGTAGT ACCT
61285 GATTAA AATTTTCAAT GATATTAGTG ATACACATCA ACTAAAACGC GTCATTACGG CATTTTCAAC ATTA
61355 ATAATA TTTACGTATT TACTATACAC GTCGACAATA AATACAGATT AGAATTTTTT ATAAAATCAA ATGT
61425 TTTGAA AACCATCAAA TAAGTATTAT GTTATGTATA TAAAGATTAG GAGATGAAGT GAATGAATAT TATG
61495 TTAACA GGTGCTACAG GTCATTTAGG CACACATATT ACAAATCAAG CCATTGCAAA TCACATAGAT CATT
61565 TTCACA TTGGTGTTAG AAATGTTGAG AAAGTTCCAG AAGATTGGCG CGGAAAAGTT CCTGTTCGAC AATT
61635 AGATTA TTTTAATCAA



SEQ ID No: 38 - SSL1 

gene 60331..61026
/gene="NCTC8325ssl1"
CDS complement 60331..61026
/gene="NCTC8325ssl1"
/product="(SET6) SSL1"
/translation="MGLKIMKFKAIAKASLAXjGMLA

GFKYTDEGKHYLEVTVGQQHSRITLLGSDKDKFKDGENSNIDVFILREGDSRQATNYSIGGVTKSNSVQYIDYINT 
PILEIKKDNEDVLKDFYYISKEDISLKELDYRLRERAIKQHGLYSNGLKQGQITITMNDGTTHTIDLSQKLEKERM 
GESIDGTKINKILVEMK"



SEQ ID No: 39 - SSL2 

gene 59350..60045
/gene="NCTC8325ssl2"
CDS complement 59350..60045
/gene="NCTC8325ssl2"
/product="(SET7) SSL2"
/translation="MKMKNIAKISLLLGILATGVNTTTEKPVHAEKKPIVISENSKKLKAYYNQPSIEYKNVTGYI 
SFIQPSIKF^IIDGNSVNNIALIGKDKQHYHTGVHRNLNIFYVNEDKRFEGAKYSXGGITSANDK^VDLIAEARV 
IKEDHTGEYDYDFFPFKIDKEAMSLKEIDFKLP^YLIDNYGLYGEMSTGKITVKKKYYGKYTFELDKKLQEDRMSD 
VINVTDIDRIEIKVIKA" 

SEQ ID No: 40 - SSL3 

gene 57989..59059
/gene="NCTC8325ssl3"
CDS complement 57989..59059
/gene="NCTC8325ssl3"
/product="(similar to SET8) SSL3"
/translation="MKMRTIAKTSLALGLLTTGAITVTTQSVKAEKIQSTKVDKVPTLKAERLAMINITAGANSAT 
TQAANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEEQKTLNISATPAPKQEQSQTTTESTTPKT.KVTTPPST 
NTPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFEFEKQFGFMLKPWTTVRFMNVIPNRFIYKIALVG 
KDEKKYKDGPYDNIDVFIVLEDNKYQLKKYSVGGITKTNSKKVNHKVELSITKKDNQGMISRDVSEYMITKEEISL 
KELDFKLRKQLIEKHNLYGNMGSGTIVIKMKNGGKYTFELHKKLQEHRMADVIDGTNIDNIEVNIK" 



SEQ ID No: 41 - SSL4 

gene 56698..57624
/gene="NCTC8325ssl4"
CDS complement 56698..57624
/gene="NCTC8325ssl4"
/product="(SET9) SSL4"
/translation="MKITTIAKTSLALGLLTTGVITTTTQAANATTLSSTKVEAPQSTPPSTKIEAPQSKPNATTP
PSTKVEAPQQTANATTPPSTBCVTTPPSTNTPQPMQSTKSDTPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEI
GIILKKWTTIRFMNWPDYFIYKIALVGKDDKKYG

VRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIEK^

MADVIDGTNI

DNIEVNIK"

SEQ ID No: 42 - SSL5 

gene 55630..56334
/gene="NCTC8325ssl5"
CDS complement 55630..56334
/gene="NCTC8325ssl5"
/product="(SET10) SSL5"
/translation="MKMTAIAKASLAIjGILATGTITSLHQTTO
GYRYSKGGKHYLIFDKNRKFTRVQIFGKDIERFKARKNPGLDIFVVKEAENRNGTVFSYGGVTKKNQDAYYDYINA
PRFQIKRDEGDGIATYGRVHYIYFCEEISLKELDFKLRQYLIQNFDLYKKFPKDSKIKVIMKDGGYYTFELNKKLQT 
NRMSDVIDGRNIEKIEANIR" 



SEQ ID No: 43 - SSL6

gene 54489..55184
/gene="NCTC8325ssl6"
CDS complement 54489..55184
/gene="NCTC8325ssl6"
/product="(SET8) SSL6"
/translation="MKLKTLAKATLVXGLLATGVITT^

QKVTDKGVYVWKDRKDYFVGLLGKDIEKYPQGEHDKQDAFLVIEEETVNGRQYSIGGLSKTNSKEFSKEVDVKVTR
KIDESSEKSKDSKFKITKEEISLKELDFKLRKKIiMEEEKLYGAVNNRKGKIVVK^
TIDGTKIKEINVELEYK"

SEQ ID No: 44 - SSL7 

gene 53373..54068
/gene="NCTC8325ssl7"
CDS complement 53373..54068
/gene="NCTC8325ssl7"
/product="(SET1-C) SSL7"
/translation="MKLKTIiAKATLALGLLTTGVITSEGQAVQAKEKQERVQHLYDIKDLHRYYSSESFEFSNISG
KVENYNGSNVVRFNQENQNHQLFLLGKDKEKYKEGIEGKDVFWKELIDPNGRLSTVGGVTKKNNKSSETNTHLFV
NKVYGGNLDASIDSFSINKEEVSLKELDFKIRQHLVKNYGLYKGTTKYGKITINLKDGEKQEIDLGDKLQFERMGD
VLNSKDINKIEVTLKQI"



SEQ ID No: 45 - SSL8

gene 52331..53029
/gene="NCTC8325ssl8"
CDS complement 52331..53029
/gene="NCTC8325ssl8"
/product="(SET12) SSL8"
/translation="MKFTVIAKAIFILGILTTSVMITENQSWAKGKYEKMNRLYDTNKLHQYYSGPSYELTNVSG
QSQGYYDSNVLLFNQQNQKFQVFLLGKDENKYKEKTHGLDVFAVPELVDLDGRIFSVSGVTPCKNVKSIFESLRTPN
LLVKKIDDKDGFSIDEFFFIQKEEVSLKELDFKIRKLLIKKYKLYEGSADKGRIVINMKDENKYEIDLSDKLDFER
MADVINSEQIKNIEVNLK"



SEQ ID No: 46 - SSL9 

gene 51253..51951
/gene="NCTC8325ssl9"
CDS complement 51253..51951
/gene="NCTC8325ssl9"
/product="(SET13) SSL9"
/translation="MKLTTIAKATLALGILTTGVFTAESQTGHAKVELDETQRKyYrNMLHQYYSEESFEPTNISV
KSEDYYGSNVLNFKQRNKAFKVFLLGDDKNKYKEKTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPS
LQVKKVDAKNGFSINELFFIQKEEVSLKELDFKIRKLLIEKYRLYKGTSDKGRIVINMKDEBCKHEIDLSEKLSFER
MFDVMDSKQIKNIEVNLN"



SEQ ID No: 47 - SSL10 

gene 50204..50887
/gene="NCTC8325ssl10"
CDS complement 50204..50887
/gene="NCTC8325ssl10"
/product="(SET14) SSL10"
/translation="MKFTAIAKATLALGILTTGTLTTEVHSGH^

KHGKNNLRFKFRGIKIQVLLPGNDKSKFQQRSYEGLDVFFVQEKRDKHDIFYTVGGVIQNNKTSGWSAPILNISK 

EKGEDAFVKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRVKISLKDGSFYNLDLRSKLKFKYMGEVI 
ESKQIKDIEVNLK" 



SEQ ID No: 48 - SSL11 

gene 46120..46797
/gene="NCTC8325ssl11"
CDS complement 46120..46797
/gene="NCTC8325ssl11"
/product="(SET15) SSL11"
/translation="MKLKNIAKASLALGILTTGMITTTAQPVKASTLEVRSQATQDLSEYYNRPFFEYTNQSGYKE
EGKVTFTPNYQLIDVTLTGNEKQNFGEDISNVDIFWRENSDRSGNTASIGGITKTNGSNYIDKVKDVNLIITKNI
DSVTSTSTSSTYTINKEEISLKELDFKLRKHLIDKHNLYKTEPKDSKIRITMKDGGFYTFELNKKLQTHRMGDVID
GRNIEKIEVNL"



SEQ ID No: 49



1047303 TATGAAAA AGAACATCAT GAATAAATTA GTTTTATCAA CAGCATTGTT ACTTTTAGAA ACTACATCAA CA
1047373 CAACTTCC TAAAACACCA ATCAGTTTTT CATCTGAAGC AAAAGCCTAT AATATCAGTG AAAACGAGAC TA
1047443 ATATCAAT GAACTAATCA AATATTACAC TCAGCCGCAT TTTTCATTAT CTGGAAAATG GTTATGGCAA AA
1047513 GCCCAATG GTAGCATTCA TGCAACATTG CAAACGTGGG TTTGGTATAG TCATATTCAA GTGTTTGGAT CC
1047583 GAGAGTTG GGGAAACATT AATCAGTTAA GAAATAAATA CGTTGATATA TTTGGAACTA AAGATGAGGA CA
1047653 CAGTTGAA GGTTACTGGA CTTATGATGA AACATTTACT GGTGGTGTTA CGCCAGCAGC TACTTCATCT GA
1047723 TAAGCCTT ATAGACTATT TTTAAAATAT AGTGATAAAC AACAAACTAT CATCGGTGGA CATGAATTTT AC
1047793 AAAGGAAA TAAACCAGTA TTAACTTTAA AAGAATTAGA TTTCCGTATT CGTCAAACAT TAATAAAAAA TA
1047863 AAAAGTTA TATAACGGAG AATTTAATAA AGGTCAAATT AAGATAACTG CTGATGGAAA TAATTACACG AT
1047933 TGATTTAA GTAAAAAGTT AAAATTAACT GACACAAACC GTTATGTTAA AAATCCTCGT AATGCAGAAA TT
1048003 GAAGTCAT ACTCGAAAAA TCTAACTAAC CTATTACCTT TTGTAAATGC GGATAATTTC AATTATCTAA TT
1048073 AACCCCTT TTTATAATTA AACATTCCAA CAATACTCAA AGGAGAAATT CGAATGAACA ATAACATCAC GA
1048143 AAAAAATT ATTTTATCAA CAACATTGTT ACTATTAGGT ACAGCATCTA CACAATTTCC TAATACACCT AT
1048213 CAATTCTT CATCTGAAGC GAAAGCTTAT TATATAAATC AAAACGAAAC TAACGTTAAT GAGTTAACTA AA
1048283 TATTACTC GCAAAAATAT TTAACCTTCT CTAACAGTAC GTTATGGCAA AAAGATAACG GTACGATTCA TG
1048353 CAACGTTG TTACAGTTTT CTTGGTATAG TCATATTCAA GTTTATGGAC CTGAAAGTTG GGGCAATATC AA
1048423 CCAATTAA GAAATAAAAG CGTTGATATT TTTGGCATAA AAGACCAAGA AACCATTGAT TCTTTTGCAT TA
1048493 TCTCAAGA AACGTTTACT GGTGGTGTTA CTCCTGCAGC AACATCTAAC GATAAACACT ATAAACTGAA TG
1048563 TAACATAT AAAGATAAAG CAGAAACGTT TACTGGCGGA TTTCCAGTTT ATGAAGGCAA TAAGCCTGTT TT
1048633 AACTTTAA AAGAATTAGA TTTTCGTATT CGTCAAACAT TAATTAAAAG TAAAAAATTA TATAATAATT CT
1048703 TATAATAA AGGACAAATT AAAATAACAG GTGCAGACAA TAACTACACA ATAGATTTAA GTAAAAGGTT GC
1048773 CATCAACT GATGCAAATA GATATGTTAA AAAACCTCAA AATGCAAAAA TTGAAGTTAT CCTCGAAAAA TC
1048843 AAACTAAC AATAATAATG GAGTTAATAA AAATAATCGC AAATACTATA TTGACTTCGC TCACATTTAA AT
1048913 TTCTTATT CCTCGTATCA TGATTCCTCT GAAAGGAGAT GTTCTAATGA GTAAGAACAT CACGAAAAAT AT
1048983 AATTTTAA CGACAACATT ATTACTATTA GGTACTGTAT TACCTCAAAA TCAAAAACCA GTATTTAGTT TT
1049053 TACTCTGA AGCTAAAGCT TATAGCATTG GTCAAGATGA AACTAACATC AATGAATTAA TTAAATATTA CA
1049123 CACAGCCT CATTTTTCAT TTTCAAATAA ATGGCTATAT CAATATGATA ATGGAAACAT TTATGTTGAA CT
1049193 TAAGAGAT ATTCATGGTC AGCACATATA TCTTTATGGG GCGCTGAAAG TTGGGGAAAT ATTAATCAGT TA
1049263 AAAGATCG TTACGTAGAT GTGTTTGGAC TAAAAGACAA AGATACTGAT CAGTTATGGT GGTCTTATAG AG
1049333 AGACATTT ACAGGTGGCG TTACACCAGC CGCAAAACCT TCTGATAAAA CTTATAATCT TTTTGTGCAA TA
1049403 CAAAGATA AACTACAAAC GATTATTGGT GCGCATAAAA TATACCAAGG CAATAAACCA GTATTAACAT TG
1049473 AAAGAAAT CGATTTCCGT GCACGAGAAG CGTTAATAAA AAATAAAATA TTATATAACG AAAATCGTAA TA
1049543 AAGGTAAG CTTAAGATCA CCGGTGGCGG TAATAACTAC ACTATTGATT TAAGCAAAAG ATTACATTCA GA
1049613 TCTAGCAA ATGTTTATGT TAAAAATCCT AATAAAATAA CTGTTGACGT CCTCTTTGAT TAGTATATGA AG
1049683 G



SEQ ID No: 50 - SSL12 

gene 1047304..1048029
/gene="NCTC8325ssl12"
CDS 1047304..1048029
/gene="NCTC8325ssl12"
/product="(similar to SA1011) SSL12"
/translation="MKKNIMNKLVLSTALLLLETTSTQLPKTPISFSSEAKAYNISENETN

SLSGKWLWQKPNGSIHATLQTWVWYSHIQVFGSESWGNINQLRNKWDIFGTKDEDTVEGYWTYDETFTGGVTPA^
TSSDKPYRLFLKYSDKQQTIIGGHEFYKGNKPVLTLKELDFRIRQTLIKNKKLYNGEFNKGQIKITADGNNYTIDL
SKKLKLTDTNRYVKNPRNAEIEVILEKSN"



SEQ ID No: 51 - SSL13 

gene 1048124..1048849
/gene="NCTC8325ssl13"
CDS 1048124..1048849
/gene="NCTC8325ssl13"
/product="(similar to SA1010 from strain N315) SSL13"
/translation="MNNNITKKIILSTTLLLLGTASTQFPNTPINSSSEAKAYYINQNETNVNELTKYYSQKYL 
TFSNSTLWQKDNGTIHATLLQFSWYSHIQVYGPESWGNINQLRNKSVDIFGIKDQETIDSFALSQETFTGGVTPAA 
TSNDKHYKLNVTYKDKAETFTGGFPVYEGNKPVLTLKELDFRIRQTLIKSKKLYNNSYNKGQIKITGADNNYTIDL 
SKRLPSTDANRYVKKPQNAKIEVILEKSN" 



SEQ ID No: 52 - SSL14 

gene 1048957..1049673
/gene="NCTC8325ssl14"
CDS 1048957..1049673
/gene="NCTC8325ssl14"
/product="(similar to SA1009 from strain N315) SSL14"
/translation="MSKNITKNIILTTTLLLLGTVLPQNQKPVFSFYSEAKAYSIGQDETNINELIBCYYTQPHF
SFSNKWLYQYDNGNIYVELKRYSWSAHISLWGAESWGNINQLKDRYVDVFGLKDKDTDQLWWSYRETFTGGVTPAA 
KPSDKTYIJLFVQYKDKLQTIIGAHKIYQGNKPVLTLKEIDFRAREALIKNKILYNENRNKGKLKITGGGNNYTIDL 
SKRLHSDLANVYVKNPNKITVDVLFD" 



S. aureus strain EMRSA 16 (252) taken from unpublished
genome project at the Sanger Centre via
http://pedant.gsf.de

SEQ ID No: 53 



122949 GC CAGCTTTATC AGTTGAAGCT TCTTCCACCT CTTCATCTTC TTCATCTTCA CCTAAATCAG GGCTTGAC
123019 GA CGAAACCTCA ATTTCTTGAT TATGAACAAT TTTACGATTA TTTAAAGGGT ACTCAAATTC TCCTTTGT
123089 CA TTTGCTTCAA CAAAACCCAA ACCGCCATTT TCTACTGAAT CCGCACTTTT CCCATCAACT GTTAATAA
123159 AA CATAGTGATT TGGTAAAGTA GTTCCCGTTA TCTTTTGAGC TCCAGGTTTA ATAGTATCTA ATTTAACA
123229 AA GTTCTTATTA GATTCTACTG TTTTTGTTTC TTTTGAATTT TGTTCAGTCT GTGAAACAAC TTTTGATT
123299 CT TGATTTGTAC TTGGCTTCTC ATTAGCCTCT GATGCTTCTG CTTGTCCGTT TGTCATGATA TATAACAT
123369 TG TAATTGCAAC AGATACTAAC CCGACTTTCA TTTTACGTAA CTTAAAATTT TCCCTCATGA TATACTCC
123439 CT CGAATATTAA TATAAATCGA CTTCAATTTT TTCTATATTT CTACCATCAA TTACATCACC CATACGAT
123509 GA GGCTGTAATT TTTTATTTAA TTCAAAAGTA TAGTAGCCGC CACCTTTCAT AGTAATTCTA ATTTTACC
123579 GT CTTTAGGCTC TGTCTTATAA AGTTCATGAT TTTCAATTAA ATGTTTTCTT AATTTGAAAT CTAATTCT
123649 TT TAATGAAATT TCTTCTTTAT TAATTCTATA AGTTTCTGCT TGTCTTGTTG CTGTATTATG TCCTGTTG
123719 GT TTAGTTATTT CTAGACCTAC ATTATTTACG AAATCTTTAT AATCGTTTTT ATTCGTCTTG GTAATTCC
123789 AC CTATTGAAAT ATTTTCAGCT TGTTTACCTG ATCCTTCTCT GACTACAAAG ACATCTAACC CTTCATAA
123859 TC ATAGTCATAA TCTTTAAATC TTTCTTTGTC TGTACCAAGT AAAGTAACTA CATTAAGTTG TGTCCCAT
123929 CA ATAATGTTCA TTTTATCCTT TTCTCTATAA CCACTCACAT TTTGAAAATT ATATCCTGTT CCACTGTA
123999 GT ATTTTTTTAA TTCTTGCGTG TCGTTTGAAG
124069 TG AGCAGTAGTT GTAATCATCC CTGTTGTCAA 
124139 AT TTCATAATTC TATGCTCCCA ATTTTTAGTC 
124209 AT TCATTATAAT GAGCAACGGT CGTGCTTCAC 
12 4279 TG ATTTAATAAA TAGTTAACTA ATATCATTCA 
124349 CT TTATCTCAAC TTAATCTTTA TTTAACTACT 
12 4419 CT ATCACAGTGC CTAAATTTAA TGTTCGTTTT 
124 4 89 AC AAATTTTATT CTTATGCAAC CTATGTGTTT 
12 4559 TT CTCTTTTTCT AATATTTCAA TTTTTATTTT 
124 629 CA CCTATCTTTT CTTGTTCCTC CAATACAGGT 
124 699 GT TCCATGTATC TGATGTTAAT CCTTGTGAAT 
124769 TT ATATCCAATA AATAAT GAG C TAGTATTTTG 

124 839 CA TTATAATTTG ATCTACCACT AGCCCCTTGC 
12 4 909 AA CTACTTTATA ATTACTTTTA TCTTTACTTG 
124979 CC ACTATTTATA GTTACTGAAA GCATTTGACC 
12504 9 CT ATTTTGCTAT TTTCCCAATG TGGATAATCT 
125119 GA TTTTCTGCAT ATAGCCTTTT TTTTGTTGTT 
125189 CG GTCGAGTTTG CTGAAGAAGT CACCTATTTT 
125259 TT TTATTAATTG CGCTACCTGA TGTTAAAGCT 
125329 AA ATGAATTAGT AAAAAAGACA TATCTTTTAA 
125399 CC TCTTAATACA AATCCACTAA ACACAGTATT 

125 4 69 CA CCAATTACCT CACTAGTCCT TGTAAAAAAA 
125539 GC TATTCACATT AACTTTTCCA G T C AG AT TAT 
125609 TT AACAATCGAC GATCCTGAGC CAAAATATTC 
125679 CG CCAACCTTCT TCTCTTCCCA TTCGCCTTCA 
1257 49 TG TATTACTCAT CTTTCAACAC CCCAAGTTCT 
125819 TT CTTTGTCGAT ATTTTTCAAA TCTTGTTGGA 
125889 TC GACATATCTC GGTATGTTTA AGTTGTAATC 
125959 AC TTATCAATTG TTTCCTTACG CTTATATGTG 
12 6029 TT GGTTTTTTCC TTTTTCAAAA TCATTGGATG 
12 6099 TT TTTAAATACT AAGATACATG TTGGAATACT 
12 6169 CT TCTAAGTAGT TCTTTTCTTC AATTAAATAA 
12 6239 AC CATGTGGGAG TACAACGGCC ATGGTACCTT 
12 6309 AA GTCTGCTTTA GACTTTGGCG CAAGCTTGCC 
12 6379 CT GCTGTCCATT TCGCGCTATA TGGTGGGTTC 
12 6449 AT TTTCCAACGT AT CAT C ATT A CGGATATCGA 
126519 CG CGCTAAGTTG TATGTGGTAT TGTTACGTTC 
12 6589 CG CGTAACAGTA ATGAACCTGA ACCACATGTT 
12 6659 GA CAATCTTCGC CAGTATCTTA GATACTTGTT 
12 6729 GC CGCAAAGCGC CCGATAAGAA ATTCGTATGC 
126799 AT GGTAAGTCAT CAAGGTTAAC CATAACTTTG 
126869 AC GCGTTGAACT TAAGTCCATA TCGCTGAACA 
12 6939 GA AGTTTCAACT TTACGAATTG CCGTCGCCAG 
127009 TC ATCGCACTGA ATAAATCTTG TGGCTCAATG 
127 079 TT CACGGTATTC TTCATCTGCC CATGCTTCTT 
127149 TC TTGTTCCGCT TTTTCAGATA AGAAGCGATA 
127219 CA TCCATATTCC CTCTTAAATC ATTCGCAATT 
127289 TT GTTTTTCAGT AAT AG AC AT G TGATTCCTCC 
127359 TT ATTATAACAT TTGGTGGTGT CGCGATGTTG 
127429 GA GATTGGGATA GG AT TC AT AA CACGCAAACA 
127499 TG GTAAGGTATA CGCACCAATT TACTTTTCAT 
127569 AT TTACTTTAAA TTCACTTCAA TATCCTTAAT 
127639 TT AATTTAGTTC TTAAATCAAG GTTATAAAAG 
127709 TG AGAGTGTTTT ATAAAGACCG TATTTTTCAA 
127779 GA TATTTCTTCT TTTTTAATAT CATAAGGGTA 
127849 CA TTTAATCTTG GTGTGCTGAC AAAGCCGGAT 
127919 TA TGTCGTGCTT GTCTCTTCTT TCTTGAACAA 

127 989 TT ACGGTACTCA TCTCCAGGCA ATAATACTTG 
128059 TA CCATGTCTCA AAGCATTAAT GTTTTTCATT 
128129 TT TGTCATGTTT GTTTACTGAC TTTTGATTTT 
128199 CC TGTTGTTAAA ATCCCTAATG CTAATGTTAC 
128269 AT CTTAATGTAT TGGATTGTTA TTATTACGTA 
128339 AA CGCGATAATT AACCTCGATT TAAATATGCG 
128409 CC AGGTTAATTA ACTTTCACGC TTTACATGTC 

128 479 GT TATTTCTACT CTTTGAAACG CAAAAAACAA 
128549 AA GGTACGGGTG TCAATGCACA TTTTATCGTT 
128619 AA TTCAAATTCA CTTCAATATT TTTAATCTGC 
128 689 TT TTTCACTTAA ATCAATTTCA TGTTTCTTTT 
128759 GC GCCTTTATAC AATCTATATT TTTCGACTAA 
128829 CT TCTTCTTTTT GAATAAAGAA CAACTCTTTT
128899 CA GTCCCGGATG ACTTACATAG CCAAACACGG 
128969 AT GCCGCCTTTA GTATCTATTA ATTCAGGTAC 
129039 TA TTTCTATCGT CGCCAATTAG AAATACTTTG



TAACTGATAA TCTGCTTTGC TCACTTGCTT TTACTGGC 
AATCCCTAGT GCTAAACTTG CTTTAGCAAT ATTTTTTA 
TATTTGATTT ATTCTATTAC GCAATACGAA CAATCCTC
ATTAAACTTA CTTTAACTAA AAATTAATCA TTATTAAG 
ATTCCTATCG ATCTACTCTC TTTTATTTAC GAATAACA
TATTCTGTAC ACAATTTCGA CACAAAAAAG ACACTGTG 
TAATTTACTT TATGTTTATA ACTTTTATGT TAAATTAC 
ATCAAAGTTA TAAGAACATC TTTTGTAAAA AGGATTGT 
CTGTTTACTA ATCAATATAT CCATTTTTTT AAAGAAAT 
ATATCTATAT TTATATTTTT TAATTGTTTA TATTTTAA
TAATTTTAAA TTTATGAATC ATTCTATGTG TTTTAAAC 
TGTTGGATAA AGCACAGTAT ATGCAGGGCT AACAATCC 
CACATTCTCA TAGAATTATA TGCAATATCA TTTTTCCT 
AATTATCTTT TCTATCCAAT TCACTAAATT TTATAATG 
TTTGTCAGAA CGTTCGTTTC TCTCTTTTAA ATATTTTT 
TCACTATTCT CATCTTTGAA TCGCAGTTCC TGTGAGAA
GAAGTAATTC AAGCTTTTGT TCTTCTAATT CAATTTGT 
TCTCTGTTCT TTAGCCGAAA CAGGGTATAT GACCTTCA 
CTAGTTGTCA TAGAACTTTT TGTAATCATT TCTTTTCT 
AATTATTATT TATTAAATCA ATTCCTGATT TAGGCCGC 
TTCAGGGTCA TTTAAAATTA CAGACGGATA ACCTATTT 
ACATCACCCT TTTCAACAGA ATAATTTTTT AGTTCTTT 
TTGTATTTAA GCTCCTGTTA TTAAATACAT CTTTGAAG 
TTTTCCTTTA TTTAAACCAT TTTTAAATTC TAATAACT 
AACCCTGGGA ATCTCAACTC TGGCACATTT TTCGTTTG 
TTCAGGTATG CATTGATTTC TTGTTCAATT TCTGCGAT 
CTTGATCTAA ATCAATTGGC GCTTCTTCTT CGAATGTA 
GTTATCGGCG ATCTCTTGTA ATGTCGCGCT GTAGCTAT 
TCTATAATTC GTTCGACTTG GGCATCGCTT AAATGGTT 
CATCGATAAA TAATACGTTG TCGTCTTGTT GGCGACAT 
TGTCCCATAG AAAATGTTCG CTGGCAAACC AATCACGG 
CGACGAATGA TACCTTCTGC GGCACCACGG AATAAGAC 
CATCATCTAG GTAATGTACC ATGTGTTGAA TAAAGGCA 
GTATCCGCTG AAGCGTTCGT CATTTTCAAA TTTTGAGT 
GCAATAACCG CATCAAATGT ATGTCCTAAA AAGGCTGG 
AATTTTCATA TCGCACATCA TGTAACAACA TGTTCATG 
TTGTCCAAAA TAACGATACA CTTTTGCCTC TTTACCAA 
GGGTCATATA CATGACGTAA TTTATCTTTA CCGTCTGT 
GTGGTGTATA GAACTCGCCT GCTTTTTTAC CCGCTGTC 
ATCACCTAAC ATATCAATTT CCATATCACT GTGAACGA 
GAAATTAACG CAGTACGTTC TTTGACATTG TTACCTAA
GTCCGATAAA GTCATTTTCA CTTTCTTCAC CTAGTGTT 
ATGTTCGATA TCGAAATCTT GCGTTTCAAT TTCACGAA 
AAGTAACCAA CTTGATCAAT TAATTCTGCT TTTAAGTC 
GATATGTAAT ATCTTCACCT GCCAAGGCAT CTGCATAT 
GAAAATCAAG CCTAAAATGT AATTACGGAA TTCACTCG 
GACCATAATT TTTTATGTAA TTCAGCTTGT TGCTGACG 
GCCTTTGCAT AAGTAATTTG TCTCTTTGTG TAATAGAT 
AGAATTTTGA TGTTGATGGT GGAAATTTTA TATTCATG
AAAACGGCTT AACGACGATA TCGATTGATG CATCCTTT
CGTTAAGCCG CTTCAATTAC TTTTTATTAT TCATAATG 
TTGTTTACTA TCTATGACTT CCCCCATATG TTTGAATT
CTACCATCTT TCAGACTAAT TTTAATCCTA CCATCTTT
TTAGATGCTT TCTCAACTTA AAATCTAACT CTTTCAAT 
ACCTTTCACA AAAGCATCTT CACCCTTTTC TTTTGTAA 
GTTTTATTGG TCTTTGTTAC ACCACCAACA GTATATGA 
AAAACACATC TAAGCCCGTA TGTCTTCGCT GTTGATAT 
AGTCTTCATC CCTCTATATT TAAAACGTAA GTTATTTT 
TCCTTAAAGT TTCCAGTGTA GTATCGGTGT AATGCTTC 
GTTTAGCATG ACCTGAATGG GCCTCTGTTG TTAAAGTT 
TTTTGCTAAT GCTGTCAATT TCATATTTAT TTACTCCA 
ATTTGAATCA TATACGCTTA TTATAACGTC CATCGTTT 
TGAATTATTA TTAATTAAAA ACAAATATAT TTTAATGT 
GTTGTTTTTC CGGATCATCG TTTTCTACAA CTAATTGT 
CTTAACAACA ATAATGTATT TAACACAGTT CTTATTCA 
AAGCCGCTTT TTAATTTTTT ATACTATCAT CTTTAATT 
TTACTATCCA GCACATCAAA CATACGATCA AAACTTAA 
CGTCTTTCAT ATTGATAACA ATTCTACCTT TATCTGAC 
CATTTTTCTG ATTTTAAAAT CAAGTTCTTT CAATGATA 
ATCGAAAAGC CATCTTTAGG ATCAACTTTT TTAACTTG 
ATCTCACATT TTTCTTTGTT ATACCGCCAA CGCTATAT 
TGCAAAGACA TCACGGCCAT GCGTCAGTTC TTTATATT 
AAATTTTTAT TTCGTTGGTT AAAGTTTAAA ACGTTAGA 
129109
129179 
129249 
129319 
129389 
129459 
129529 
129599 
129669 
129739 
129809 
129879 
129949 
130019 
130089 
130159 
130229 
130299 
130369 
130439 
130509 
130579 
130649 
130719 
130789 
130859 
130929 
130999 
131069 
131139 
131209 
131279 
131349 
131419 
131489 
131559 
131629 
131699 
131769 
131839 
131909 
131979 
132049 
132119 
132189 
132259 
132329 
132399 
132469 
132539 
132609 
132679 
132749 
132819 
132889 
132959 
133029 
133099 
133169 
133239 
133309 
133379 
133449 
133519 
133589 
133659 
133729 
133799 
133869 
133939 
134009 
134079 
134149 



CC CATAATAATC 
AG CATATTTATA 
CT GTCATCACAC 
TT ATTGCTCCAA 
GT GCATCGAAAC 
AA TATTCAAAGT 
TG TAAAAAACAG 
CG TTTCATCGTT 
AT AGTGACTGAT 
CA CCTAAATCAA 
TT TATATAATCC 
TC TTTTTGGATT 
GT GTATTAGTTT 
AT CAATTAATTC 
CC TAATAAGAAT 
CA ACCTTACCAC 
AT GTAAATGTTG 
GT AGTTAATAAA
CT ATTTTTAAAT 
TC ATTATATAAG 
AA TTAATGGAAT 
TC TTTCAATTTA 
AA CTATTTTCAC 
CA AAAAAGAAGC 
TT CGTTGAATTA 
TT TGTTTGTAAC 
TA GCTTTAGGAA 
TT TCAGCGAAAT 
TT TTTAATAACG 
CA TATGAATACA 
AC GTTTTTTAAT 
AT AAGGTAGTGC 
TA TAGTAATCTC 
TA CAGTTTGAGC 
GC TGTCATCTTC 
AC AGCTTTATTA 
TA ATTAAATTAA 
AT TGGTATTTTT 
AT ACCACTGTGC 
TT CAACGGTTCT 
TC TATGACATCT 
TC GTTTTAATAA 
CA ATTTAAAATC 
AT TTTGCCTTTC 
TC GTGATACCCC 
AT TATCGTATGG 
TC TGGTATAACA 
CT AAACTCGGTT 
TG TGGTTGGCGA 
TG TGGCGATTGC 
GT TGTACATTTG 
TG CTGATGCTAC 
GC TAAACTGGTT 
TT TACGTAATTA 
AC TATATTTGAA 
GC TTGCACATTG 
CC CTTAACATCA 
TT CATCGTTCAA 
TG TACCATCAAT 
CC ATTTTTCATT 
GT TTTCTCAATT 
AT GTGAGATTGT 
TT TGCTTTTGTG 
CA TCGACATGCC 
GA AATAATCTGG 
TT AAATTCTAAA 
CT TGTTTTGTGG 
TG TTGGTTGTGT 
GC TTTTTGTATG 
TA GACTTTGCCG 
GA TTGCGCTTGT 
GC CGATTGCGTT 
TT CTTATTTTCA 



TTCACTTTTA 
TAATATTTGC 
CTGTTGTTAA 
TCTTAATGTA 
AAACTGACAT 
ACGTGGTTTT 
CCTAACGACG 
AAGCTGCCTC 
ATACCTCTAA 
TTTCTACTTT 
GTAATTATTA 
AAAAATGAGT 
CAGAAGTTTT 
TTGTACTACA 
AATTGGTGAT
TAACATTACT 
TACTCTCTCT 
CCTAATGCTA 
TTTGACTTAA 
CCGGATACTG 
CCTAACAGTA 
ATTATGTTTT 
AGTTTTTATT 
TAAGCAACCT 
TCTAATATTG 
TTTTTATTGA 
ACTTTTTATA 
CTCTTCCTTA 
AATCTAGGTG 
CTGTGCCGTT 
CCTTTCTATA 
TTGCCACCTT 
TCAAGTCGAA 
CGTTGATGTT 
ATAGTTGTAT 
TAGTTGGCGT 
GGAATCCATA 
AGACACATTT 
AACACGAAAG 
ATGCAGTATC 
GCCATGCGAT 
CAATTGTTCC 
AAGCTCTTTC 
TCATCTTTTT 
CGACAGAATA 
CCCATCTTTA 
TTCATAAACC 
TCGTATAATA 
TTGCGGTTGT 
GGTTGTACAT 
GTGAAGGCGA 
TTCTGCATTA 
TTCGCAATTT 
GAATCATACA 
TCATCGTTAA 
GGCATTGCAT 
TTACGCAAAA 
CGGTTCTATG 
AACATCTGCC 
TTAATAACGA
TAAAATCAAG 
ACCTTTTTTA 
ATACCACCGA 
TATGCACGCC 
AACAATATTC
CTTGATTTCG 
TTGGCGATGT 
TGATTGCGGT 
TTAGATTTTG 
CCTGTTCAGG 
TGATACGTTT 
GTTACTATAC 
TAGTTGTATG 



ACACTAATGT 
GTTGTGTTTC 
TATTCCTAAT 
TTGGATTGTT 
TAACGTGCGT 
AATGAGTGGT 
ACGTTGCTTT 
AAAATTTATT 
TGTCTTTACT 
ATTTTCGTCT 
ACTAATTGTT 
CAATTGATGC 
GTTGTTTTTC 
AAGACATTTT 
TTTGATCTTT 
ATATTCGAAA 
TGTTTTTCTG 
ATGTTGCTTT 
TATATGTTTA 
CGCTATGAAT 
TTAATCATTT 
CCACATTGTT 
GTCGTTATTA 
ATGCTGCCTA 
GCTTCTATTT 
GTTCAAACGT 
CAGATCAAAA 
TAAATGTAGT 
CGCTTAAGTA 
GCGATTTTCC 
TCTTTACCAA 
TGCTATAACG 
GATATCTTTT 
ATAACCCCAG 
GCTCCAATCG 
ATGTCTCTAT 
ATGTTCGTTA 
AGGACATAAG 
AAGCTGAGCA 
CATCACGATG 
GTTCTTGTAA 
TGAACCAATA 
AAGGAAATCT 
TAGTAATGCT 
TTTTTTCAAT 
TATTTCTTAT 
TAACTGTCGT 
TGTTCTTAAA 
TTTGTTTCGA 
TTGGTGAAGG 
TGTTGTTACT 
GCCGATTGCG 
GTGTCATTTT 
ACTACATTAT 
GTAATCCATA 
CATTTTCAAG 
CAAAAGAAGC 
TAGTATCCAT 
ATGCGATGTT 
TTGTTCCTGA 
TTCTTTCAAG 
TCTTCTTTAG 
CCGAGTATTT 
TTCACCATAT
ATAAATCTTA 
TATAATATGA 
TGGTTGTGGT 
TGTGCATTTG 
GCGTCTGTTC 
ATTAGCCGTA 
ATCTTTGCTA 
TTGCACCTGT 
CTCCAATCGT 



TTGTTGATTC 
ATCCAACTTT 
GCTAATGTTG 
ATGATTACGT 
TTAAATAACT 
ATGACTTATT 
AAATACGCTC 
ACTTTAAAGT 
ATTCAACACA
TTCAAATTGA 
GTCTAATTTT 
ATCTAAATCT
TTCGTTACAC 
GGCCTTGTAG 
TGGGTTAAAG 
CTTTCTGATG 
CCGCTTGAAC 
AGCTAACGTT 
TGTGATTAAT 
TAACCGCTTT 
TTTATAACTC 
TCATGTCACT 
TCTTTCATCG 
GCTCATCTAT 
TCTCAATATT 
ATAATAGCCG 
TCTTGAATTA 
ATCTTTTAAC 
ATCATAGTAA
GCTTCTTTAA 
AAATTTGTAT 
ATAGCCAGTA 
GTCACATTTT 
TC3CTAAAAT 
TAATTATTAG 
TCATATTAAA 
AATAAAAATG 
CATGTTATGT 
ACATGTAAAT 
ATTTTAGATT 
TTTTTTATGC 
TTACCGTACA 
CTTCTTTAGT 
TAATTCTGCT 
TGATATTTAT 
CATCTTTTCC 
CCATGGTTTC 
TCCTTATATT 
CATTAGATGA 
CGGAGTTGTT 
TTTTGTATTT 
TTGTTACTAT
CATAGTTGTA 
AGAAGACGTA 
ATTTTCGTTA 
ACATTGAGAT 
TAAGCAATGA 
CACGATGATT 
GTTGCAATTT 
ACCAATATTA 
GAAATCTCTT 
TAATACTTAT 
GTCTACGCCA 
TTCTTATCAT 
TTGTTGTCCA 
TCTTAAACCT 
GTTTCGACAT 
GTAAAGACTG 
TGTTTGAGTT 
TGATTTTGTT 
ACTGTCCTGC 
TGTTACAAGC 
AATTATTCGA 



ATAGCTTTCT 
ACTTTCGCGT 
CTTTAGCTAT 
AATTTGAAAC 
TTGAATCATT 
TAGTATACTC 
CTTTGATAGG 
CATTGATTGC 
TCGCCCATGC 
TAATGATTTT 
GAAATCAAGC 
TCACCATTAA 
CACCAACAGT 
ACCTTCTTTA 
CGTACAACGT 
AGTAGTATCG 
TGCTTGACCT 
TTTAATTTCA 
TCAATTACGT 
TTAACTATTC 
TATGCTATTT 
AGTGCGACAC 
ACCCACTTCA 
ACTATCCATA
TCTGCCATCT 
CCATCTTTCA 
AATACTGTCG 
ATGCACAGAA 
GCGCCTTGAT 
CTACAAATAT
TCTTGTGAAC 
ACATTTTTAA 
CATATTTTGA 
ACTTAGAGCT 
ATTTGTTCGA 
CCTTGTTTAA 
ACTTTTGTTG 
GTGGTCAACA 
TTTTGTCACT 
CACTTCAATT 
AATTCAAACG 
AATTATGTTG 
TATCTTATAT 
TTGTGATCAA 
TGTCCTCTAA 
AACTAAAGCT 
AGCATAAAGC 
TAGGGTTTAT 
ATGCGGTGTT 
ACACTTGGTG 
TAGATTTTAT 
ACTTGCACCT 
TGCTCCAATC 
TCGGTCTATT 
AATACAGATG
GTCACTTTTA 
TGTAGGTCAT 
TTAGATTCAC 
TTTGTGTAAT 
CCGTACAAAT 
CTTTAGTAAT 
TCCAGTTTTG 
TATTTATTTT 
CTTTTCCAAC 
TTTTTTGATA 
TTAAATTTAG 
TAGATGAAGG 
AGTTGTTCCG 
GAAAGCGATG 
CTGCGTTCAC 
TTTAAGTGCT 
CCTAACGCTA 
TTGTTCTTTA 



TGAGAATAGT 
TTACAGTTTG 
TGCTGTTAAT 
ATCTGCCTTT 
GTTAATGAAC 
GTGACTTGTG 
ATATACATAT 
TTACTTTAAA 
GCTCGAATTG 
ACCGTATTTA 
TCTTTTAATG 
CTTTATTAAC 
AGATAGTCTG 
TATTGTTCTT 
TAGAACCATT 
ATGTAAATCT 
TCTGATGTAA 
TAGTATTATT 
AATACGAACA 
CTGAACCAAT 
TTTTCACCAT 
CGTGCGACTA 
TTTCCGATCG 
TTTTACTATT 
ATGACGTCAC 
TTGTCACTTT 
TAATTTAAAA 
ACACCTGCAC 
TTTTCTTAGT 
ATCTAATCCT 
TTTCTATTTT 
GTTCCTTACT 
TTCATGTTCG 
AAACTGGCTT 
TTACGTTTAT 
CTATATTTGA 
TACTTCAACG 
TCAAATATTC 
CAACTTCTCG 
CTTTCGATAC
TGTACTTCCC 
TTCAATGAGT 
TCTGAATCAT 
CTTTTTTACT 
AACGATAAAT 
ATTTTATAAA 
CAAATTGCTT 
TTCCTTTTGT 
GTTGGGTTTG 
TTGGTTGTGG 
TGCCTGTTCT 
GTTGTTAAAA 
TATTATATTC 
CACATTAAAC 
GTTTCGAAGC 
CAAACAAGTG 
TGTCGCTTAA 
TTCAATTCTA 
TCAAACGTGT 
TATGTTGTTC 
CTTGTATTCT 
TAATCAACTT 
GTTCTAAAAC 
TAAAGCTATT 
ATTATACCCA 
GGTTTATTTC 
CGGTGTTGTT 
CTTGGTGCTG 
TTGATACATT
TGCTTGCGCT 
GATGCTACTT 
TACTCGCTTT 
CGTAATTAGA 



AATCTTTT 
ACTTTCTG 
TTCATAAT 
ATTATAAA 
ACCTAATT 
TTATCAAA
TTATTCAA 
TTTGGTTA 
TAATTTAT 
GATGTACC 
AGATTTCT 
AAATAAAG 
CCGTTTGG 
TATCTTTT 
GTAGTTTT 
CTAATATC
TGACACCA 
CTCCCAAT 
ATCTACTA 
GTTAAGCT 
TCACACTT 
TAGGTATC 
TAACGACA 
ATCCATGT 
TCATTCGG 
TATCTTGC 
TCAAGTTC 
CTACTTCT 
AACGCCAC 
GGATTTTT 
TATCAAAG 
TGCGCGAC 
CTCGCATT 
TCGCAATT 
TGAATCAT 
ATCATCGT 
CTTGATAG 
GTCTTCTT 
GTACATCG 
TAGTGCCA 
ACCGTTTT 
TGCTTTCT 
CGTGTGAG 
ATTTGTCT 
ACATCAAT 
TGAACCAA 
TTCGAATT 
GCTTGTTT 
GTGTTGGT 
CGATTGCG 
GTTTTAAG 
GTGATAGT 
GATTGTTC 
CATGTTTA 
GATTCAAC
AATAATTT 
CTTCTCAT 
TCAATATT
ACTTTCCA 
AATGAGTT 
GAAACATC 
TCTTTCTA 
GATAAATA 
TTATATAT 
ACTCATTT 
CTTTTGTG 
ACACTTGG 
GTTGTGTT 
TTCTGTGT 
GTCGTTGT 
CTGCATTG 
CGCAATTG 
ATCATACA 
134219
134289 
134359 
134429 
134499 
134569 
134639 
134709 
134779 
134849 
134919 
134989 
135059 
135129 
135199 
135269 
135339 
135409 
135479 
135549 
135619 
135689 
135759 
135829 
135899 
135969 
136039 
136109 
136179 
136249 
136319 
136389 
136459 
136529 
136599 



AC TTCATTATAG
GT CATTTAAGCA 
TT AAGACATAAA 
TA CGACGACCAG 
CG ATATCGGCCA 
TT GGACAGTAAC 
TT AAAATCAACC 
CA CCAGTGTGAT 
GA TACCACCAAT 
CG ATGTACACCC 
CT ATCATATTCA 
TT CAACACTAGG 
TC GGCATGCACC 
TT GCAATTGATT 
GA ATCATACAAC 
AT ATGATTAAGC
CT CAAAAAAGAA 
AT AGCGCTTTGT 
GA ATCACCCATA 
TA ATTGTAATTT 
TC TATAATCAAG 
TT ACCTTTCTTA 
CG CCACCAATCG 
GT TAACTTCTTC 
AT AAAATCAATG 
GT CCATTATAAT 
TA TATTCGACGT 
TT CATAATTTTT 
TT ATATTGAGGC 
TA ATAAAATGCT 
AA CCATTTTAAT 
TC AAAATTTTCA 
AA CATATGCGTA 
TG AAAATCATCG 
AA CAGGTGCTAC 



AGGACATATT 
AAAAATAATG 
AAAGAAGTTA 
TCTAAGCATT 
TTCGGTCTTC 
TGTTCCTGAA 
TCTTTTAATG 
CTTCTTTAAT 
CGAGTACTTT 
GTATGATAGT 
AAAATTTAGT 
TTGAGTATAG 
GGTTGTTCCG 
TCATTTTCAT 
CATATTATAG 
TAGTTAAGTA 
GCTAAGCAAC 
CGTTAGAAAG 
CGTTCTTTTT 
GACCTTGTTT 
TTCTTTTAAT 
ATTTCAAGGA 
AATGATTATC 
ATTATATTTA 
AATCTATCTT 
AAAGTTTTAA 
TATTACACCT 
AGCCCCAATC 
GCTATTCCCT 
AACTGTTAAT 
TTTAAAAACT 
AAGATCATGG 
TTTACTATAC 
AATAAGTATT 
AGGTCATTTA 



GGTCTATTCA 
AGTACTTTCG 
AGTAACGTTT 
ATGCTTTTTT 
TTGTAACTTT 
CTCATTTCAC 
TCATTGCTTC 
GACTCTTGCT 
GCACCTTCAA 
GTTCCTTATC 
GCCTGGTTTA 
TACGCTTTTA 
TTACAGTATT 
AATTGTATGC 
GAGCTGTATG 
AAAGATAATG 
AGTACGTCGC 
TATTATTTCA 
CAAGTTTTTG 
AAGACCATTT
GAGATATTTT 
TTGGAGAGTG 
AGCTTGTCTA 
TCTTTATCAG 
TACCTTCAAT 
ATCAGCCTCA 
GTTGCTAACA 
TATTTAAATT 
TTCAAATTAA 
AATCAATTCT 
GTAAAACTTT 
TGACACGCAT 
ACGGCGACAA 
ATGTTATGTA 
GGCACACATA 



CATTAACCCT 
AGGTTTATGT 
TATCATCACT 
AACTTTGACT 
TTATCCAATT 
CATAAAGACC
TTTATCTATT 
TCTGCTACTT 
ATCTCTTATC 
TTTGCCAACT 
GGTTGAATAC 
AATTTTTACT 
TACACCTGTA 
TCCAATATAT 
CTGATATTCA 
CGTGCTGCGA 
TTAACTTCCA 
TTTCTACTAG 
ACTTAAATCG 
GAATACAATC 
CTTTATAAAT 
TATATAGTCG 
CCGTTTCCTT 
AACCAACTAA 
AAATCCATAT
CTTTGTTGTT 
TTCCCAATGC 
TGATTTTTCA 
ACCTCATTTA 
TATCCGTCAA 
CAGACACAAA 
CAACTAAAAC 
TAAATACAGA 
TATAAAGATT 
TTACAAATCA 



CGTTTAACAA 
GGTGTTACTA 
TAACCTCACT 
TCAATCCTAT 
CAATTGTATA 
ATAATGATCA 
TTAAATGGGA 
GGTCGACAAC 
CTCATTAACA 
AAAGCTAGAT
TACTGATATA 
ATTTTGGCTT 
GCTAAAATAC 
TAGATTAGAT
CATTAACCCT 
CTGTTTGTGT 
TATATTTCCA 
AATTTTTTGT 
ATAGTATGTG 
CGTGTTGTTT 
TTGATACAGA 
TAATACACTC 
CTCTAACGAC 
AGATATTTTA 
CCAGTTACTT 
TAACTTCTGT 
TAAACTTGCT 
ATTACGTAAT 
ACTAAACTTG 
GACTCTTTTT 
AATGACATCA 
GCGTCATTAC 
TTGCAATTTT 
AGGAGATGAA 
AGCCATTGCA 



TTGTTGAAAT 
TGTATCTTTG 
ATTGCAATTT 
CAATATCTAT 
CTTGCCATAG 
ATAAGATGTT 
AAAAGTCATA 
TTTATCGCTT 
TAAAATATAT 
TATTAACTGT 
ACCTGTCACA 
ACTTGAACAT 
CCAACACTAA 
TGTTTTATTA 
TTTTTAACTA 
GAGTTTTGTA 
TAACAAGCAC 
ATTTGTCTGC 
ATTTGCCATC 
GATTGCACGT 
CTATTTTGTG 
CTCTATTAGT 
AAACACATCT 
TTATATTGTC 
TTTTATATTC 
TTTCGCTTGT 
TTTGCTATCG 
ATGAACAATC 
AATCATTGTT 
TATACAATTC 
TATCGTCGCA
GGTATTTTTA 
TTATAAAATC 
GTGAATGAAT 



ATTATTAA 
AAATGATT 
CAATTCAT 
GACTTTGA 
TATTTCAT 
TTCTTATT 
ATCATATT 
GCTTTTGT 
CAAGGTTG 
ATTACCTT 
TTTTTATA 
GTTTATTT 
ACTTACTT 
CGTAAATT 
TTCATAAA 
ACTTTGCC 
GATTCAAC 
CGTCGATA 
TTTCATAG 
TCTCGTAA 
GTTCTTCA 
TTTTGTTA 
ATATCTGG 
CATTGTAT 
AAAACTTG 
ACTGATTG 
CTTTAAAT 
TACCTACA 
AAGGTAAT 
ACACTTTA 
GTACCATG 
ACATTAAT 
AAATGTTT 
ATTATGTT 



SEQ ID No: 54 - SSL1

gene 135433..136128
/gene="EMRSA 16 (252) ssl1"
CDS complement 135433..136128
/gene="EMRSA 16 (252) ssl1"
/product=" (similar to SET6 from strain Mu50)SSL1"
/translation="MGLKIMKFKAIAKASLALGMLATGVITSN^

GYGFIEGKDRFIDFIYNGQYNKISLVGSDKDKYNEEVNPDIDVFWREGNGRQADNHSIGGVTKTNRGVYYDYIHS
PILEIKKGNEEPQNSLYQIYKENISLKELDYRLRERAIKQHGLYSNGLKQGQITITMKDGKSHTIDLSQKLEKERM
GDSIDGRQIQKILVEMK"
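As a reading aid for the feature tables in this listing, the following Python sketch shows how a location such as complement 135433..136128 relates to a /translation entry. It is an illustration only and not part of the original listing: the helper names (clean, reverse_complement, translate, feature_length) are hypothetical, the standard genetic code is assumed, and the test fragment is the first listed line of SEQ ID No: 67 (position 110530).

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def clean(listing_line: str) -> str:
    """Keep only A/C/G/T, dropping position numbers, spaces and OCR debris."""
    return "".join(ch for ch in listing_line.upper() if ch in "ACGT")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def feature_length(start: int, end: int) -> int:
    """Length in bases of an inclusive start..end location."""
    return end - start + 1


# A 'complement' feature is read from the reverse strand of the given span,
# so that span would be reverse-complemented before translation.
# The SSL1 span above covers 696 bases, i.e. 232 codons including the stop.
ssl1_bases = feature_length(135433, 136128)

# Forward-strand example: first listed line of SEQ ID No: 67.
line = ("110530 A TTTAAATAGA TTGGGGCTAA AAATTATGAA ATTTAAAGCG "
        "ATAGCAAAAG CAAGTTTAGC ATTGGGAAT")
seq = clean(line)
peptide = translate(seq[seq.find("ATG"):])
print(ssl1_bases, peptide)  # 696 MKFKAIAKASLALG
```

The peptide printed for this fragment, MKFKAIAKASLALG, also occurs near the start of the SSL1 translation given above, which is a quick consistency check on the cleaned-up listing lines.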



SEQ ID No: 55 - SSL2

gene 134449..135150
/gene="EMRSA 16 (252) ssl2"
CDS complement 134449..135150
/gene="EMRSA 16 (252) ssl2"
/product=" (similar to SET7 from strain N315)SSL2"
/translation="MKMKSIAKVSLVLGILATGV^^

SSIQPKPGTKFLNMIEGNTVNNLALVGKDKEHYHTGVHRNLDIFYVNEDKRFEGAKYSIGGITKASDKWDQVAEA
RVIKEDHTGEYDYDFFPFKIDKEAMTLKEVDFKIRKHLIDHYGLYGEMSSGTVTVQMKYYGKYTIELDKKLQEDRM
ADIVKVIDIDRIEVKVKKA"



SEQ ID No: 56 - SSL3

gene 133130..134161
/gene="EMRSA 16 (252) ssl3"
CDS complement 133130..134161
/gene="EMRSA 16 (252) ssl3"
/product=" (similar to SET9 from strain N315)SSL3"
/translation="MKIRTIAKASIALGLVTTGASIVTTQSANAEVASALBCAGQLAKINVSTSAITTTAQAVNAEQ
NHTANPEQAAKSNTENVSTSLSTQTEQTPKSNIQKATQPAPSGTTQSLPNAQPQSTQPTPSVTTPPSSNVETPQPT
SPTTKQAQKEINPKFKGLRSYYTKSSLEFKNELGIIIKKTO^^

VFIVLEQNKYGVDKYSVGGITKANRKICVDYKTGISITKEDKKGTISHDVSEYKITKEEISLKELDFKLRKQLIEQH
NLYGNIGSGTIVIKMKNGGKYTFELHKKLQQHRMADVIDGTNIDRIEVNLKSS"



SEQ ID No: 57 - SSL4

gene 131863..132783
/gene="EMRSA 16 (252) ssl4"
CDS complement 131863..132783
/gene="EMRSA 16 (252) ssl4"
/product=" (similar to SET8 from strain Mu50)SSL4"
/translation="MKMTQIAKTSIALSLLTTGASIVTTQSANAEVASALKTEQAIKSKIQKVTTSPSPNVQPQS^
QPTPSVTTPPSPNVQPQSPQPTPNPTTPHSSNVETKQPQSPTTKQAQKEINPKYKDLRTYYTKPSLEFEKQFGFML
KPWTTVRFMIWIPDWFIYKIALVGKDDKKYKDGPYDNIDVF^

KKDEKGKISHDDSEYKITKEEISLKELDFKLRKQLIEQHNLYGNIGSGTIVIKTKNGGKYTFELHKKLQEHRMADV
IDGTSIERIEVNLKSS"

SEQ ID No: 58 - SSL5

gene 130798..131502
/gene="EMRSA 16 (252) ssl5"
CDS complement 130798..131502
/gene="EMRSA 16 (252) ssl5"
/product=" (similar to SET3)SSL5"
/translation="MKMTAIAKASIJUjSILATGVIT

GYRYSKGGKHYLIFDKNRKFTRIQIFGKDIERIKKRKNPGLDIFWKEAENRNGTVYSYGGVTKICNQGAYYDYLSA
PRFVIKKEVGAGVSVHVKRYYIYPCEEISLKELDFKLRQYLIQDFDLYKKFPKASKIKVTMKDGGYYTFELNKKLQT
NRMSDVIDGRNIEKIEANIR"

SEQ ID No: 59 - SSL7

gene 129656..130351
/gene="EMRSA 16 (252) ssl7"
CDS complement 129656..130351
/gene="EMRSA 16 (252) ssl7"
/product=" (similar to SET1)SSL7"
/translation="MKLKTLAKATLALGLLTTGVITSEGQAVQAAEKQERVQHLHDIRDLHRYYSSESFEYSNVSG
KVENYNGSNVVRFNPKDQNHQLFLLGKDKEQYKEGLQGQN^

NKVNGEDLDASIDSFLIQKEEISLKELDFKIRQQLVNNYGLYKGTSKYGKIIINLKDENBCVEIDLGDKLQFERMGD
VLNSKDIRGISVTINQI"

SEQ ID No: 60 - SSL9

gene 128617..129315
/gene="EMRSA 16 (252) ssl9"
CDS complement 128617..129315
/gene="EMRSA 16 (252) ssl9"
/product=" (similar to SET5)SSL9"
/translation="MKLTAIAKATIALGILTTGVMTAESQTVNAKVKLDETQRKYYINMLKDYYSQESYESTNISV
KSEDYYGSNVLNFNQRNKNFKVFLIGDDRNKYKELTHGRDVFAVPELIDTKGGIYSVGGITKKNVRSVFGYVSHPG
LQVKKVDPKDGFSIKELFFIQKEEVSLKELDFKIRKMLVEKYRLYKGASDKGRIVINMKDEKKHEIDLSEKLSFDR
MFDVLDSKQIKNIEVNLN"



SEQ ID No: 61 - SSL10

gene 127571..128254
/gene="EMRSA 16 (252) ssl10"
CDS complement 127571..128254
/gene="EMRSA 16 (252) ssl10"
/product=" (similar to SET4)SSL10"
/translation="l^LTALAKVTLALGILTTGTLTTEAHSGHAKQNQKSVNKHDKEALHRYYTGWFKEMKNIN
ALRHGKNNLRFKYRGMKTQVLLPGDEYRKYQQRRHTGLDVFFVQERRDKHDISYTVGGVTKTNKTSGFVSTPRLNV
TBCEKGEDAFVKGYPYDIKKEEISLKELDFKLRKHLIEKYGLYKTLSKDGRIKISLKDGSFYNLDLRTKLKFKHMGE
VIDSKQIKDIEVNLK"

SEQ ID No: 62 - SSL11

gene 123447..124145
/gene="EMRSA 16 (252) ssl11"
CDS complement 123447..124145
/gene="EMRSA 16 (252) ssl11"
/product=" (similar to SET15 from strain Mu50)SSL11"
/translation="MKLKNIAKASLALGILTTGMITTTAQPVfCASEQSRLSVTSNDTQELKKYYSGTGYNFQNV
SGYREKDKMNIIDGTQLNVVTLLGTDKERFKDYDYDYEGLDVFVVREGSGKQAENISIGGITKTNPQSJDYKDFVNNV
GLEITKPTGHNTATRQAETYRINKEEISLKELDFKLRKBLIENHELYKTEPKDGKIRITMKGGGYYTFELNKKLQP
HRMGDVIDGRNIEKIEVDLY"

SEQ ID No: 63 



38500 
38570 
38640 
38710 
38780 
38850 
38920 
38990 
39060 
39130 
39200 
39270 
39340 
39410 
39480 
39550 
39620 
39690 
39760 
39830 
39900 
39970 
40040 
40110 
40180 
40250 
40320 
40390 
40460 
40530 
40600 
40670 
40740 
40810 
40880 
40950 
41020 
41090 
41160 
41230 
41300 
41370 



A CAATGTAAGT 
G GGGCGGGCCC 
T ACTAAATTAA 
G TATTTACGAT
T TAACATAAAC 
C GGTGATTTTA 
T GCTCGAAAAT 
A TTGTTTGTAG 
T TACACCACCT 
C ACGTCTACAT 
G CTGACCAAGA 
A AAATGTGAGA 
A TAAGCTTTAG 
A ACAATGTTGT 
T CATCAATACG 
A CTTCGATTTC 
T ATCTATTTGC 
T CTTAATGTGG 
G AAATCAACTT 
C CTGCTTTATC 
C GCCAGTAAAT 
T ACATATTTAT 
C ATGATAACTG 
T TAAATGGCGT 
T TTTGCTTCAG 
G CTATTGATAA 
A TGCGAGGAAT 
A TGAATGATAC
G TTTGTGTCAG 
A TTTGACCTTT 
C TAGTTCTTTT 
T TTATCCTTAT 
G TAAAGGTTTC 
A TTTATCTCTT 
C CACGTTTGTA 
T GTGGCTGCGT 
C TTCAGATGAA 
T GATAAAACAA 
T ATAAGGTATT 
C GTGTGATTTA 
T AATAAATTAA 
T ACTTGAAAAA



TGGGGTGGGG 
AAACCCAGAG 
TGATTGTGTA 
ATAAATCATA 
ATTTGCAAGA 
AGTTTACCTT 
CAATTTCTTT 
TTTATCTTTG 
GTAAATGTAT 
AGCGGTCTCT 
ATAACGCTTA 
GGCAGTTGTG 
CGTCTGTAGA 
CATTAAAATT 
AGAATTAACA 
GATGTTAGCT 
ATCAGTTGAT 
CCTTTATTAT 
CCTTTAAAGT 
TTTATATGTT 
GTTTCTTGGG 
TTCTTAATTG 
TAACAATGTT 
TGTGTGTAAT 
ATGACATACT 
AATAATTTTT 
AATAAGTTTA 
TAATGAGATT 
TTAGTTTTAA 
ATTAAATTCT 
AAAGTTATTA 
ATTTTAAAAA 
ATCATAAGTC 
AATTGATTAA 
ATGTTGCATG 
GTAATATTTG 
AAACTGATTG 
TTTTGTTCGT 
GCACCGATTC 
AGTAAAATTT 
AACTGATGCT 
TAGTTATACT 



CCCCAACATA 
AATGACTTGG 
AGTACTATTG 
CGACCTTATT 
TCAGAATGTA 
TATTGCGATT 
CAATGTTAAT
TATTGGACAA 
CACGATATAC 
TAATTGATTA 
AATTCAACAT 
TGTAATATTT 
AACATTTAAT 
AACTTTTTCA 
AATTTAAAAA 
AGATTTTTCA 
TTTAATCTTT 
AATCACCATT 
TAAAACTGGT 
ACATTCAATT 
CTAACATATA 
ATTAATATTA 
GCATGGATTG 
ATTTAGTCAA 
AATCGGTGAA 
TTTGTGAGGT 
TAAAAAGAGA 
TTTCAAGAAT 
CTTTTTACTC 
CCATTATATA
CTGGTTTATT 
TAATTTATAA 
CAATAGCCTT 
TGTTGCCCCA 
TATGGTACCG 
ATTAATTCAT 
GTGCTTTAGG 
GATGTTCTTT 
GGAATTAAAT 
TTCATTTTTA 
AATTATGATA 
TTAAATGTAG 



GAGAAATTGG 
CATTTCTATT 
AGACATTTTG 
TAATCAATGA 
ATCTTTTACT 
TTCATGATAT 
ATTGGTTTAT 
AAAGACTATA 
CCACCATAAC 
ACGTTGCCCC 
AAATGTTGCC 
AATTAACTCA 
GGCGAATTAG
TAATGCACTT 
TGAACGGCGT 
AGGAGGATTT 
TGCTTAAATC 
ATATAATTTT 
TTATTACCTG 
TATAATGTTT 
AGTGCGCCAG 
CCCCAACTTT 
TACCATTATC 
TTCATTTACA 
TGAGGGGATT 
TGTTTTTCAT 
GAAATCAATT 
GACTTCAATT 
AAATCAATCG 
GCTTTTTGTT 
ACCTTGATAA 
GGCTTATCAG 
CTATAGTATC 
ACTTTCAGGA 
TTAGGGTTTT 
TGATATTCGT 
CGGTTGTGTT 
TTCATATGAT 
AAAAGCTAAC 
AGCATTGGAA 
TTAATTGCGT 



GTTACCAATT 
AGACTCCAAA 
TAGTATTACT 
GGACTTCGAC 
CAAATCAATT 
AATATTTTAT 
TACCTTGGTA 
TGGTTTATCA 
TGGCTAGTGT 
AACTTTCAGC 
GTCATCATAT 
TTGATGTTCG 
AGGTTTGTGT 
TCTCATTCGA 
CAATGTAAAC 
CGATGTGTGC 
TATTGTGTAG 
TTGTTTTTAA 
TATAAAATTG 
ATCGCTAGGT 
GTTTCATAAT 
CAGGGCCGAA 
TTTTTGCCAT 
TTCGTTTCGC 
GACTTGTAGC 
TAGAAAATCT 
AAATCATTCA 
TCTGCATGTC 
TATAATGATT 
CTTAATTAAA 
AACTCAAGTC 
ATGAAGTAGC 
CTCATCCTTG 
CCAAATACTT 
GCCATAGCCA 
TTCGTTTTCA 
GATGTAGTTC 
AATCTCCTTT 
ACTATGTTAA 
CATTGCCAAT 
TAAATTAGTT 



TCTACAAGCA 
TCGCATTTAA 
CAAATATTGT 
TGTTATTTTT 
GTAAAGTCAT
TTTTAATTAG 
CATCTTATGT 
GAAGGTGATG 
CTTCGTCTTT 
ACCCCATAAT 
TGGTATAACC 
TTTCGTCTTG 
AGCTGTTGTT 
ACTTCTCCTT 
ATTGTGCTTA 
AGTTTGAGGG 
TTGTTGCTGC 
TTAGTGTTTG 
ATATTCGCCA 
TTTGCTACTG 
CTTTTAAAGC 
TACTTGAATA 
AGCCATTTAT 
TTTCACTTAT 
TGTGCCTATC 
CCTTTACGAG 
GTCTATTTGT 
GAGGATTTTT 
ACCACCACCT 
GTTTGACGTA 
CACCAATCAT 
TGCTGGTGTT 
GTACCAAAAA 
GAATATGCGT 
TTTTCCAGAA 
CTTATATTGT 
CTAAAAGTAA 
GCGTGAATTA 
ATAAACATAA 
TATAAAATGA 
ATTTATTGAA 



ATGCAAGTT 
TATTCTGTG 
TTATCCTTA 
TGAGGATTT 
TGCCGCCAC 
TGTTTCACG 
GCACCAATA 
CAGCAGGTG 
TAAACCAAA 
CGTATATGC 
ATCTATTTG 
GCCAATATG 
CCTAATAAT 
TCTGCGTAA 
TTTCTGTTT 
TTTTTAACA 
CACCTGTAA 
GCGGATTCG 
ATAAATGTT 
GTGTAACAC 
AAAAATATC 
TGACTAAAC 
TTGAAAATG 
ATTGTATGC 
ATAGTCAAT 
GAATGTTAT
ATTCACTCA 
AACATAACT 
GTAATCCTA 
CGCGGAAAT 
AGTTTGTTG 
ACACCTCCA 
TATCTACAT 
GTACCAAAC 
AATGAAAGG 
ATGCTTTTG 
CAATGCTGT 
CCCATAGTA 
ACAGTTGAA 
AAAGAGTGC 
TTTTCTAAC 
/gene="EMRSA 16 (252) ssl12"
CDS complement 40400..41125
/gene="EMRSA 16 (252) ssl12"
/product=" (Similar to SAV1168 of strain Mu50)SSL12"
/translation="MKKNITNKIVLSTAL

SGKWLWQNPNGTIHATLQTWVWYTHIQVFGPESWGNINQLRDKYVDIFGTKDEDTIEGYWTYDETFTGGVTPAATS
SDKPYKLFLKYKDKQQTMIGGLEFYQGNKPVITLKELDFRVRQTLIBCNKKLYNGEFNKGQIRITGGGNHYTIDLSK
KLKLTDTNSYVKNPRHAEIEVILEKSH"



SEQ ID No: 65 - SSL13

gene 39565..40290
/gene="EMRSA 16 (252) ssl13"
CDS complement 39565..40290
/gene="EMRSA 16 (252) ssl13"
/product=" (Similar to SA1010 of strain N315)SSL13"
/translation="MKNNLTKKIILSIALTMIGTATSQSPHSPISMSSEAKAYKISESETNVNELTKYYTQRHLTF
SNKWLWQKDNGTIHATLLQLSWFSHIQVFGPESWGNINQLRNKYVDIFALKDYETWRTYMLAQETFTGGVTPVAKP
SDKHYKLNVTYKDKAGTFIGEYQFYTGNKPVLTLKEVDFRIRQTLIKNKKLYNGDYNKGHIKITGGSNNYTIDLSK
RLKSTDANRYVKNPQTAHIEILLEKSS"



SEQ ID No: 66 - SSL14

gene 38740..39456
/gene="EMRSA 16 (252) ssl14"
CDS complement 38740..39456
/gene="EMRSA 16 (252) ssl14"
/product=" (Similar to SAV1166 of strain Mu50)SSL14"
/translation="MRKCIMKKLILMTTLLLLGTTATQTSNSPLNVSTDAKAYHIGQDETNINELIKYYTQLPLTF
SNRWLYQYDDGNIYVEFKRYSWSAHIRLWGAESWGNVNQLRDRYVDVFGLKDEDTSQLWWVYRDTFTGGVTPAASP
SDKPYSLFVQYKDKLQTIIGAHKMYQGNKPILTLKEIDFRARETLIKNKILYHENRNKGKLKITGGGNDFTIDLSK
RLHSDLANVYVKNPQKITVEVLID"

S. aureus strain MSSA-476, taken from the unpublished genome
project at the Sanger Centre via http://pedant.gsf.de

SEQ ID No: 67 

110530 A TTTAAATAGA TTGGGGCTAA AAATTATGAA ATTTAAAGCG ATAGCAAAAG CAAGTTTAGC ATTGGGAAT 
110600 G TTAGCAACAG GTGTAATTAC ATCGAATGTA CAATCAGTAC AAGCGAAAAC AGAAGTTAAA CAACAAAGT 
110670 G AGGCTGATTT AAAACTTTAT TATAATGGAC CAAGTTTTGA ATATAAAAAA GTAACTGGAT ATGGATTTA 
110740 T TGAAGGTAAA GATAGATTTA TTGATTTTAT ATACAATGGA CAATATAATA AAATATCTTT AGTTGGTTC 
110810 T GATAAAGATA AATATAATGA AGAAGTTAAC CCAGATATAG ATGTGTTTGT CGTTAGAGAA GGAAACGGT 
110880 A GACAAGCTGA TAATCATTCG ATTGGTGGCA TAACAAAAAC TAATAGAGGA GTGTATTATG ACTATATAC 
110950 A CACACCAATC CTTGAAATCA AGAAAGGTAA AGAAGAACCA CAAAGTAGTC TATACCAAAT TTATAAAGA 
111020 A GACATCTCAC TAAAAGAACT TGATTTTAAA TTAAGAAAGC AATTAATTAG TCAAAGTGGC TTGTATTCA 
111090 A ATGGTCTTAA ACAAGGTCAA ATTACAATTA CAATGAATGA TGGCACAACA CATACAATCG ATTTAAGTC 
111160 A AAAACTTGAA AAAGAACGTA TGGGCGAGTC TATCGATGGC AGACAAATAC AAAAAATTCT AGTAGAAAT 
111230 G AAATAATACT TTCTAACAAC AAAGCGCTAT GTTGAATAGT GCTTGTTATG GAAATATATG GAAGTTAAG 
111300 C GACGTACTGT TGCTTAGCTT CTTTTTTTGA GGGGAAAAGT TACAAAACTC ACACAAACAG TCGCACCAC 
111370 G CATTATCTTT TGCTTAAATA GCTTAATCAT ATTTTATGAA TAGTTAAAAA CAGGTTAATG TGAATATCT
111440 G AATACAGCTC CTATAATATG GGTGTATGGT TCAAATTACG TAATAAAACA ATCTAATTAT AATAGATTG
111510 G AGCATACAGC TATGAAAATG AAATCAATTG TAAAAATAAG TTTGTTATTA GGAATATTAG CAACAGGTG
111580 T AAACACTACA ACGGAAAAAC CAGTTCATGC CGAAAAGAAA CCTATTGTAA TAAGTGAAAA TAGCAAAAA 
111650 A TTAAAAGCTT ATTATACTCA ACCTAGTATT GAATATAAAA ATGTGACAGG TTATATCAGT TTCATTCAA 
111720 C CAAGTATTAA ATTTATGAAT ATCATAGATG GTAATTCTGT TAATAATATT GCTTTAATTG GCAAAGATA
111790 A GCAACATTAT CATACGGGTG TACATCGTAA TCTTAATATA TTTTACGTTA ATGAGGATAA GAGATTTGA 
111860 A GGTGCAAAGT ACTCCATTGG CGGTATCACG AGTGCAAACG ATAAAGCTGT CGACCTAATA GCAGAAGCA
111930 A GAGTTATTAA AGCAGATCAT ATTGGTGAAT ATGATTATGA CTTTTTCCCA TTTAAAATAG ATAAAGAAG 
112000 C AATGTCATTG AAAGAGATTG ATTTTAAATT AAGAAAATAC CTTATTGATA ATTATGGTCT TTACGGTGA
112070 A ATGAGTACAG GGAAAATTAC CGTCAAAAAG AAATACTACG GAAAGTATAC ATTTGAATTG GATAAAAAG 
112140 T TACAAGAAGA CCGGATGTCC GATGTTATCA ATGTCACAGA TATTGATAGA ATTGAAATCA AAGTTAGAA 
112210 A AGCATAACAC ACATACTTGA CGACGAAATA
112280 G TTGCTTAACT TCTTTTTAAT GCTTAAAAAT 
112350 A TCACTCATTA TTTTTTGCTT AAATTACTTA 
112420 T ATCTTAGAAT GCCATCTATA ATGATGTTGT 
112490 A GATTGGAGCA TACAATTATG AAAATGAGAA
112560 C AGGCGCAATT ACAGTAACGA CGCAATCGGT 

112630 A CCAACGCTTA AAGCAGAGCG ATTAGCAATG
112700 G CAGCTAACAC AAGACAAGAA CGCACGCCTA 
112770 C AGCTTCCAAA ATAGAAAAAA TATCACAACC 
112840 G CCAGCGCCTA AACAAGAACA ATCACAAACG 
112910 C CTCCATCAAC AAACACGCCA CAACCAATGC 
112980 A ACAAGCACAA ACAGATATGA CTCCTAAATA 
113050 A TTTGAAAAGC AGTTTGGATT TTTGCTCAAA 
113120 A GGTTCATCTA TAAAATAGCT TTAGTTGGAA 
113190 T CGATGTATTT ATCGTTTTAG AAGACAATAA 
113260 G ACTAATAGTA AAAAAGTTAA TCACAAAGTA 
113330 T CACGCGATGT TTCAGAATAC ATGATTACTA 
113400 G AAAACAACTT ATTGAAAAAC ATAATCTTTA 
113470 A AACGGTGGGA AATATACGTT TGAATTACAC 
113540 G GCACTAATAT TGATAACATT GAAGTGAATA 

113610 G GAAAAACAAG AAGTTAAGTG ACAACGGTTT
113680 A AAAGACGAAT ATTCATTTGT TTGTAAAAGT 
113750 G CCAAGTGTTG AATCACATCA AAATCATTTT 
113820 A TGATTCAAAT ATAGTTAAAC AAGGTTTAAT
113890 A TTCAATGAAT GTAATCGAAC AAATCTAATA 
113960 T GCTAAAACAA GTTTAGCACT AGGCCTTTTA 

114030 G CGACAACACC GCCTTCAACT AAAGTGGAAA
114100 C TAAAGTGGAA GCACCGCAAC AAGCAGCAAA 
114170 A TCAAAACCAA ACGCGACAAC ACCATCTTCA 
114240 A CACCACCTTC GTCTAATGTA GACACATCAC 
114310 T AAATCCTAAA TTTAAAGATT TAAGAGCGTA 
114380 T ATTATTTTAA AAAAATGGAC GACAATAAGA 
114450 G CTTTAGTTGG TAAAGATGAT AAAAAATATG
114520 T AGAAGAAAAT AATTACAATC TCGAAAAATA 
114590 T GATCACAAAG CAGGAGTAAG AATTACTAAG 
114660 T TCAAGATTAC TAAAGAACAG ATTTCCTTGA
114730 A AAATAATCTG TACGGTAACG TTGGTTCAGG 
114800 G TTTGAATTGC ACAAAAAATT ACAAGAAAAT
114870 A TTGAAGTGAA TATAAAATAA TCATGACATT 
114940 G TGACAACGGC CTACATGTTG CTTAGCTTCT
115010 G GGTCCAAATA TGACGTGGAA GAGTCCTGAA 
115080 C GGAATCAGTT TTATTTAACG AACATTATAG 
115150 A CATGGTTTAA TGTGAAAGGT CAAATACGCC 
115220 A CAAATCTAAT AATTACGAAT GGAGCATACA 
115290 T TAGGTATTTT AGCAACAGGA ACAATAACGT 
115360 A ATATGAAAAT GTGACAAAAG ATATCTTTGA 
115430 A AATGTTACTG GTTATCGTTA TAGCAAAGGT 
115500 A CTAGAATACA AATTTTTGGT AAAGATATAG 
115570 T TGTTGTTAAA GAAGCGGAAA ACCGTAATGG
115640 A GACGCTTATT ATGATTATAT AAACGCACCA
115710 A CGTACGGTAG AGTACACTAC ATTTATAAAG 
115780 A GTATTTAATT CAAAATTTTG ATCTGTATAA
115850 A GATGGCGGCT ATTATACGTT TGAACTTAAT 
115920 G GTAGAAATAT TGAAAAAATA GAAGCCAACA 
115990 G ATAGTATAGA GGAGTTAGGC AACATAAGTT
116060 C GTATCGATGA ATAATAAAAA CACCAATAAA 
116130 C TTTTCGTGAC ATGAAACAAT GTGGAAAACA 
116200 G CGTTAAGTTT AAAAAATAGA TTAACGCTGT 
116270 G TTAAAAAGAG GTTAATTCAT AGCTTAGTAT
116340 A TTGAAATAAT CATATAAAAA TATATTAAGA 
116410 T TAAAAGCGTT AGCTAAAGCA ACATTAGTAT 
116480 A AACAGTAAAA GCGGCAGAAT CAACTCAAGG
116550 G CCAAGTATAG AGTTAATAAA TGTAGATGGT 
116620 T GGAAAAATCT TAAAGATTAT TATATTGGGC 
116690 A CGGGGACCTA GATGCATTTT TAGTCATAGA 
116760 T ATAAGTAAGA CAAATAGTAA AGAATTTAAA 
116830 A GAGATACTAC ATCAACTAAA GATAGTAAAT 
116900 A TTTTAAATTA AGACAAAAAT TGATGAAAGA 
116970 A ATTGTAGTTA AAATGGAAGA TGATAAGTTT 
117040 A TGGGTGACAC GATAGATGGT ACCAAAATCA 
117110 A CAAGCAGACT AGTAATTGTA GGGAAGTTAA 
117180 G ATGAAAAAAG GAGCGGGTTT ATGATCAAGT 
117250 C TTTTCGTGAC ATGAAACAAT GTGGAAAACA 



ATTTGAAATT GAAATAGAGA GGTTAAGTGA CGATCAAAC 
CATTTCAAAG GCACATAGAA ACGCTATATT AACCTCATA 
ATAATACTTC AATAATTGTT AAAAGGGGTT TAATGTGAT 
ATGATTCAAA TTACGTAAAA AGACAATCGA ATATAATAT
CAATTACTAA AACCAGTTTA GCACTAGGGC TTTTAACAA 
CAAAGCAGAA AAAATACAAT CAACTAAAGT TGACAAAGT 
ATAAACATAA CAGCAGGTGC AAATTCAGCG ACAACACAA 
AACTCGAAAA GGCACCAAAT ACTAATGAGG AAAAAACCT 
TAAACAAGAA GAGCAGAAAA CGCTTAATAT ATCAGCAAC 
ACAACCGAAT CTACAACGCA GCAAACTAAA ATGACAACA 
AATCTACTAA ATCAGACACA CCACAATCTC CAACCATAA 
TGAAGATTTA AGAGCGTATT ACACGAAACC GAGTTTTGA 
CCATGGACGA CGGTTAGGTT TATGAATGTT ATTCCAAAT 
AAGATGAGAA AAAATATAAA GATGGACCTT ACGATAATA 
ATATCAATTG AAAAAATATT CTGTCGGTGG CATCACGAA 
GAATTAAGCA TTACTAAAAA AGATAATCAA GGTATGATT 
AGGAAGAGAT TTCCTTGAAA GAGCTTGATT TTAAATTGA 
CGGTAACATG GGTTCAGGAA CAATCGTTAT TAAAATGAA 
AAAAAACTGC AAGAGCATCG TATGGCAGAC GTCATAGAT
TAAAATAATC ATGACATTCT CTAAATAGAA GCTGTCATC 
ACATGTTGCT TAGCTTCTTT TATTATGCGT AATGATGTA 
GGCATTTCTA TGTCTTAAAA GTGACGAAAC TTCAAATGT
TATTTAACGA ACATTATGGA TTTCTTAATT TACTTAACG 
GTGAATGGAG CAATGCGCCA TCTATAATAA AGCTGTATG 
ATTACGAATG GAGCATACAA CTATGAAAAT AACAACAAT
ACAACAGGTG TAATCACAAC GACAACGCAA GCAGCAAAT 
CACCGCAACA AGTAGCAAAT GCAACAACAC CATCTTCAA 
CGCGACAACA CCATCTTCAA CTAAAGTAGA AGCACCGCA
ACTAAAGTGG AAGCACCGCA ACAAGCAGCA AACGCGACA 
CACCACAATC GCCAACCACA AAACAAGTAC CAACAGAAA 
TTATACGAAA CCAAGTTTAG AATTTAAAAA TGAGATTGG 
TTTATGAATG TTGTCCCAGA TTATTTCATA TATAAAATT 
GTGAAGGAGT ACATAGGAAT GTCGATGTAT TTGTCGTTT 
TTCTGTCGGT GGTATCACAA AGAGTAATAG TAAAAAAGT 
GAAGATAATA AAGGTACAAT CTCTCATGAT GTTTCAGAA 
AAGAACTTGA TTTTAAATTG AGAAAACAAC TTATTGAAA 
TAAAATTGTT ATTAAAATGA AAAACGGTGG AAAGTACAC 
CGCATGGCAG ATGTCATAGA TGGCACTAAT ATTGATAAC
CTCTAAATAG AAGCTGTCAT CGGAAAAACA AGAAGTTAA 
TTTGTTATGT TCGATGATTT GAGAACCCGA ATTTTCGAT 
TTTATCTGTA AATCCCTATC TATCGGGTGT GGAGCACAA 
ATTCCTTAAT TTACTTAATA ATGATTCAAT GATTATTAA 
AATTATAATA AAGCTGTATG ATTCAATAGA CGTAAGCGA 
ACTATGAAAA TGGCAGCAAT TGCGAAAGCA AGTTTAGCA
CATTGCATCA AACTGTAAAT GCGAGTGAAC ATGAAGCAA 
CTTAAGAGAT TACTATAGTG GCGCAAGTAA GGAACTTAA 
GGCAAGCATT ACCTTATCTT TGATAAACAT CAAAAGTTC 
AAAGATTTAA AGCACGCAAA AATCCGGGAT TAGACATAT 
CACAGTGTTT TCATATGGTG GTGTCACTAA GAAAAATCA 
AGATTTCAAA TCAAGAGAGA TGAAGGTGAC GGTATTGCT 
AAGAGATTTC ACTTAAAGAA CTCGACTTTA AATTGAGAC 
AAAGTTTCCT AAAGATAGTA AGATAAAAGT GATAATGAA 
AAAAAATTAC AAACAAATCG CATGAGTGAC GTCATTGAC 
TTAGATAATG CAATGAAATA TGGATAATAG TAAAATATG 
GCTTAGCTTC TTTTTTGTGT TGGAGAGATG AAAAAGAAG 
ACTTGTGGGA ATAGTTGATA CCTATAGTCG CGCGTTGTC
TAATTAAATT GAGGGAAAGT GTGAATAGTT AAAAAAGCT
TAGGATTCTA TTAATTAGCT TAACATTGGT TCAAAAATA 
TCCGCTTATA TAATGATAGT AGATTGTTCG TATTACGTA 
CAAAATTTAT AAATAGATTG GGAGAATAGT ACTATGAAA
TGGGATTGTT AGCTACTGGT GTAATAACAA CAGAAAGTC 
TCAACACAAT TATAAATCAT TAAAATACTA CTATAGCAA 
CTGTATAGAC AACATTTAAC TGATAAAGGT GCATATGTA 
TACTAGGTGA AGATAGTAAG AAATTCAAAT CAGATGTAT 
AGAAGAACCT GTTAAAGGAA GACAATATTC AATTGGCGG 
GAAAGAGAAG TCGATGTTAA AGTAACAAGA AAAGCAGAC 
TTAAAATTAC AAAAGAAGAA ATCTCGTTAA AAGAGTTAG 
AGAGAATTTA TACGATGCAA TTAACCATAG AAAAGGTAA 
TATACTTTCG AACTTACAAA AAAATTACAA CCGCATCGC 
AAGAAATTAA TGTTGAGCTA GAATATAAAT AATCTTTGG 
GCGATAACAT ATTGCTTAGC TTCTTTTTTG TTTTGTTAT 
TTTTGGAAAA ACGGTTGATA CTTATAGTCG CGCGTTGTC 
TAATTAAATT GAGGGAAAGT GTGAATAGTT AAAAAATTA 
117320 G TATTGTGTTA TAAAAAATAA TTAATACTGT TAGGATTTCA TTAACTAACT TAACGTTGGT TCAAAAATA
117390 G TTAAAAAGAG GTTAATTCAT AGCGCAGTAT CTCACTTATA TAATGATAGT AGATTGTTCG TATTACGTA 
117460 A TTGAATTAAT CATATAAAAA TATATTAAGA CAAAATTTAT AAATAGATTG GGAGAATAGT ACTGTGAAA
117530 T TAAAAAOGTT AGCTAAAGCA ACATTGGCAT TAGGCTTATT AACTACTGGT GTGATTACAT CAGAAGGCC 
5 117600 A AGCAGTGCAA GCAAAAGAAA AGCAAGAGAG AGTACAACAT TTATATGATA TTAAAGACTT ACATCGATA 

117670 C TACTCATCAG AAAGTTTTGA ATTCAGTAAT ATTAGTGGTA AGGTTGAAAA TTATAACGGT TCTAACGTT
117740 G TACGCTTTAA CCAAGAAAAT CAAAATCACC AATTATTCTT ATCAGGAAAA GATAAAGATA AATATAAAG 
117810 A AGGCCTTGAA GGCCAGAATG TCTTTGTGGT AAAAGAATTA ATTGATCCAA ACGGTAGACT ATCTACTGT 
117880 T GGTGGTGTAA CGAAGAAAAA TAACCAATCT TCTGAAACTA ATACACCTTT ATTTATAAAA AAAGTGTAT
117950 G GCGGAAATTT AGATGCATCA ATTGAATCAT TTTTAATTAA TAAAGAAGAA GTTTCACTGA AAGAACTTG

118020 A TTTCAAAATT AGACAACATT TAGTTAAAAA TTATGGTTTA TATAAAGGTA CGACTAAATA CGGTAAGAT 
118090 C ACTTTCAATT TGAAAGATGG AGAAAAGCAA GAAATTGATT TAGGTGATAA ATTGCAATTC GAGCACATG
118160 G GCGATGTGTT GAATAGTAAG GATATTCAAA ATATAGCAGT GACTATTAAT CAAATTTAAA GTAAGTAAT
118230 C AATGACTCTA AAGTAATAAA TTTGAAGCAG CTTAGCGATG AAATGTTGAA TAGATACGTA CACCTTACA
118300 T AAAGGAGCGT ATTTAAAACA ACCTTGTCGT TAGGCTTTTT TTACGTTTTA TAACGCAGGG TATGAGCGT

118370 A CTAAAAATTC ACATTACTTC TGAAAGTGAT GTCCATTGAA TATTAATTAG TTCTTCATTA ACCATGATT 
118440 T AATTTTAATT AAACGAGTGT TAATGTCAGT CTGTCTCAAT GCCCTTTATA ATAAATGTGT ATTATTCAA
118510 A TTACGTAATA AAAGCAATCC AATATATTAA GATTGGAGCA TATGAATATG AAATTTACAG CGATAGCTA
118580 A AGCGATATTT GTATTAGGAA TATTAACAAC AAGTGTAATG ATAACAGAAA ATCAATCGGT TAATGCAAA
118650 A GGAAAGTATG AAAAAATGAA CCGTTTATAT GATACAAACA AGTTACATCA ATACTATTCA GGACCTAGT

118720 T ATGAGTTAAC AAATGTTAGT GGCCAAAGTC AAGGTTATTA TGACTCTAAC GTTTTGCTTT TTAACCAAC 
118790 A AAATCAAAAG TTCCAAGTGT TTTTATTGGG AAAAGATGAA AATAAATACA AAGAAAAAAC ACATGGTTT
118860 A GATGTCTTTG CGGTACCGGA ATTAGTAGAT TTAGATGGAA GAATATTTAG TGTTAGTGGT GTAACAAAG
118930 A AAAATGTAAA ATCAATATTT GAGTCTCTAA GAACGCCGAA CTTACTAGTT AAAAAAATAG ACGATAAAG
119000 A CGGTTTTTCG TATGATGAAT TTTTCTTTAT TCAAAAGGAA GAAGTATCAT TGAAGGAACT TGATTTCAA

119070 A ATAAGAAAAC TGTTAATTAA AAAATACAAA TTGTATGAAG GGGCAGCTGA TAAAGGTAGA ATTGTTATT 
119140 A ATATGAAAGA TGAAAATAAG TATGAAATTG ATTTAAGTGA TAAATTAGGT TTCGAGCGTA TGGCAGATG 
119210 T CATTAATAGT GAACAAATTA AAAACATCGA AGTGAATTTG AAATAATCAA TGATATATAT AGAATGAAA
119280 G CTTAAGAAGC GGTTTAATAA TCCCATGTTT AATGATTTTG ATACGTGTTT TAATAATAAA AACATATCG
119350 A ACATTGACTA CGTTATTAAG CTGCTTTTTT GTACACTTTG TATCGAATAA CTTAAGATCT AAAACTAAT

119420 C GGAAAGAACA ATGATTCCCC TAAAAAAATT TATGTTGCTA TTAAAAATCA GTTAATACGA ATGTTAACA 
119490 T ACGTTTGATT TTCATTAATA ATGATTCAAG TTTATTTAAA TGAGCGTTAA TGTCAGTCTG TTTTGATGC
119560 A CCTTATAATA AAGACAGATA GTTCAAATTA CGTAATAATA ACAATCCAAT ATATCAAGAT TGGAGCAAA
119630 T AAATATGAAA TTTACGGCAT TAGCAAAAGC AACATTAGCA TTAGGAATAT TAACTACAGG TGTGTTTAC
119700 A ACAGAAAGTA AAGCTGTTCA CGCGAAAGTA GAACTTGATG AGACACAACG CAAATATTAT ATCAATATG

119770 C TACATCAATA CTATTCTGAA GAAAGTTTTG AACCAACAAA TATTAGTGTT AAAAGCGAAG ATTACTATG
119840 G CTCTAACGTT TTAAACTTTA AACAACGAAA TAAAGCTTTT AAAGTATTTT TACTTGGTGA CGATAAAAA
119910 T AAATATAAAG AAAAAACACA TGGCCTTGAT GTCTTTGCAG TACCTGAATT AATAGATATA AAAGGTGGC
119980 A TATATAGCGT TGGCGGTATA ACAAAGAAAA ATGTGAGATC AGTGTTTGGA TTTGTAAGTA ATCCAAGTC
120050 T ACAAGTTAAA AAAATCGATC CTAAACATGG CTTTTCGATA AATGAGTTGT TCTTTATTCA AAAGGAAGA

120120 A GTATCGTTGA AGGAACTGGA TTTTAAAATA AGAAAAATGT TAGTCGAAAA ATATAGATTG TATAAAGGC
120190 G CGTCAGATAA AGGTAGAATC GTTATTAATA TGAAAGACGA AAAGAAATAT GTAATTGATT TAAGTGAAA
120260 A ATTAAGTTTT GATCGTATGT TTGATGTAAT GGATAGTAAG CAAATTAAAA ATATTGAAGT GAATTTGAA
120330 T TAATTAAGTA TAATAACTTA AGAAGCGACT TAACGACAAA ATGTGAATTG ACATGCATGT CCTTAAATA
120400 A GGAACTGTGT TAAATACATT ACTGTTGTTA AGTTGTTTTT TGCGTTTCAA AGAGCAGAAC AGAGTAACA
120470 T CATCAGTTGT AGTAAACGAT AATCTAGTAA AACAACTAAA TGAAATAATG AAATTCATTT AACCTGAAC
120540 A TTAAAATATA TTTGTTTTTC ATTAAGAATA ATTCAAGTAT ATTTAAATCG AGGTTAATTA TCGTATGAA 
120610 A CGATGCACGT TATAATAAAA ATGTATGATT CAAATTACGT AATGAAAACA ATCCAATATA TTAAGATTG
120680 G AGCAAAACAA TATGAAATTA ACAGCGATAG CTAAAGCTGC ATTAGCTTTA GGAATTTTAA CAACAGGAA
120750 C TTTAACAACA GAAGTTCATT CAGGTCATGC AAAACAAAAT CAAAAGTCAG TAAATAAACA TGACAAGGA
120820 A GCATTATACC GATACTACAC TGGAAAGACT ATGGAAATGA AAAATATTAG TGCTTTGAAA CATGGTAAA
120890 A ATAACTTGCG TTTTAAGTTT AGAGGTATTA AGATTCAAGT TTTACTGCCT GGAAATGATA AAAGTAAAT
120960 T TCAACAGCGT AGTTATGAGG GGTTAGATGT GTTTTTTGTT CAAGAAAAAA GAGATAAGCA CGATATATT
121030 T TATACTGTTG GTGGTGTAAT ACAGAATAAT AAAACATCTG GAGTTGTCAG TGCACCAATA TTAAATATT
121100 T CAAAAGAAAA GGGTGAAGAT GCTTTTGTGA AAGGTTACCC TTATTACATT AAAAAAGAAA AAATAACAC
121170 T AAAAGAGCTG GATTATAAGT TGAGAAAGCA TCTAATCGAA AAATATGGAC TTTATAAAAC AATCTCAAA
121240 A GATGGTAGGG TCAAAATTAG CTTGAAAGAT GGCAGTTTTT ATAACCTTGA TTTAAGATCT AAATTAAAA
121310 T TCAAATATAT GGGGGAAGTC ATAGAAAGCA AACAAATTAA AGATATTGAA GTTAACTTAA AGTAAATAA
121380 T TACGAATAAT AAAAAGTAAT TGAAGCGGCT TAACGATGAA AAGTAAATTG ATGCGCATAC CTTACCAAA 
121450 A GGATGCATCA ATCGATATCG TCGTTAAGCT GTTTTGGTTT ACGTTTCATG GATTCTACCC CAATTTTCA 
121520 T AAATATAAAA ATTCCACCAC CAACATCAAA ATTCTCAACA TCGCAACATA CCCAAATGTT ATAATAAAT 
121590 C TATTACACAA AGAGATAAAT TACTTATTCA AAGGCGGAGG AATCACATGT CTATTACTGA AAAACAACG 
121660 T CAGCAACAAG CTGAATTACA TAAAAAATTA TGGTCGATTG CGAATGATTT AAGAGGGAAC ATGGATGCG 
121730 A GTGAATTCCG TAATTACATT TTAGGCTTGA TTTTCTATCG CTTCTTATCC GAAAAAGCAG AACAAGAAT 
121800 A TGCAGATGCG TTGGCAGGTG AAGATATCAC GTATCAAGAG ACATGGGCAG ATGAAGAATA TCGTGAAGA 
121870 C TTAAAAGCTG AATTAATTGA TCAAGTCGGT TACTTCATTG AACCACAAGA TTTATTCAGT GCGATGATT 
121940 C GTGAAATTGA AACGCAAGAT TTCGATATCG AACATCTGGC GACGGCAATT CGTAAAGTTG AAACATCAA 
122010 C ATTAGGTGAA GAAAGTGAAA ATGACTTTAT CGGACTGTTC AGCGATATGG ACTTAAGTTC AACGCGACT 
122080 A GGTAACAATG TCAAAGAACG TACTGCTTTA ATCTCTAAAG TCATGGTTAA TCTTGACGAT TTACCATTT 
122150 G TTCACAGTGA TATGGAAATT GATATGTTAG GTGATGCATA TGAATTCCTA ATTGGGCGCT TTGCGGCGA
122220 C AGCAGGTAAA AAAGCAGGCG AGTTCTATAC ACCACAACAA GTATCTAAGA TACTGGCGAA GATTGTCAC 
122290 A GACGGTAAAG ATAAATTACG TCATGTGTAC GACCCAACAT GTGGTTCCGG TTCATTATTG TTACGTC3TT 
122360 G GTAAAGAAAC GCAAGTGTAT CGTTATTTCG GTCAAGAACG TAACAATACC ACTTACAACT TAGCACGCA 
122430 T GAACATGTTA TTACATGATG TACGTTATGA AAATTTCGAT ATCCGTAATG ATGACACGTT GGAAAATCC
122500 A GCCTTTTTAG GACATACATT TGATGCGGTT ATTGCGAACC CACCATACAG TGCGAAATGG ACAGCAGAT
122570 T CAAAATTTGA AAATGACGAA CGATTCAGCG GATACGGCAA ACTTGCGCCA AAGTCCAAAG CAGACTTTG

122640 C CTTTATTCAA CACATGGTAC ATTACTTAGA CGATGAAGGT ACCATGGCCG TTGTACTCCC ACATGGTGT
122710 C TTATTCCGTG GTGCTGCAGA AGGTGTCATT CGTCGTTATT TAATTGAAGA AAAGAACTAC TTAGAAGCC 
122780 G TGATTGGCTT ACCAGCCAAT ATTTTCTATG GGACAAGTAT TCCAACATGT ATTTTAGTAT TTAAAAAAT 
122850 G TCGCCAACAA GAAGACTATG TATTATTTAT CGATGCATCC AATGATTTTG AAAAAGGAAA AAATCAAAA
122920 C CATTTAACCG ATGCCCAAGT CGAACGCATT ATTAACACAT ATAAGCGTAA GGAAACAATT GATAAATAT 
122990 A GCTACAGTGC GACATTACAA GAGATCGCCG ATAACGATTA CAACTTAAAC ATACCGAGAT ATGTCGATA 
123060 C ATTCGAAGAA GAAGCACCGA TTGATTTAGA TCAAGTCCAA CAAGATTTGA AAAATATCGA CAAAGAAAT 
123130 C GCAGAAGTTG AACAAGAAAT CAATGCATAC CTGAAAGAAC TTGGGGTGTT GAAAGATGAG TAATACACA 
123200 A AAGAAAAATG TGCCAGAGTT GAGGTTCCCA GGGTTTGAAG GCGAATGGGA AGAGAAGAAG TTAGGGGAC 
123270 C TTACTACCAA AATAGGTAGT GGAAAGACTC CCAAAGGTGG AAGTGAAAAC TATACAAACA AAGGCATAC 
123340 C ATTTTTAAGG AGTCAAAATA TTAGAAATGG TAAATTAAAT CTTAATGACT TAGTTTATAT TAGTAAAGA 
123410 T ATAGATGATG AGATGAAAAA TAGTAGAACG TACTATGGTG ATGTTCTTTT AAATATTACA GGAGCATCA
123480 A TAGGTAGAAC AGCCATTAAT TCGATAGTTG AAATACATGC TAATTTAAAT CAACATGTAT GTATTATTA 
123550 G ATTGAAAAAA GAGTATTATT ATAATTTTTT TGGACAGTAT CTATTATCAA GAAAAGGTAA AAGGAAAAT 

123620 T TTCCTTGCAC AAAGTGGAGG TAGTCGAGAA GGACTAAACT TCAAAGAAAT TGCTAATTTA AAAATCTTC
123690 A CCCCAACTAT ATTTGAAGAG CAGCAAAAAA TAGGCGAATT CATCAGCAAA CTTGACCGAC AAATTGAAT
123760 T AGAAGAACAA AAACTTGAAT TACTTCAGCA ACAGAAAAAA GGCTATATGC AGAAAATCTT CTCGCAAGA 
123830 A TTGCGATTCA AAGATGAGGA AGGTAAAGAT TATCCAGATT GGAAATCAAA ATCAATTCAA GAAATATTT
123900 G AGAATAAGGG TGGCACTGCT CTAGAAACAG AATTTAATTT TGACGGTAAT TATAAAGTTA TAAGTATAG
123970 G AAGTTATTCT ATAAATAGCA CTTATAATGA TCAAAATATA AGAGTCAATA AAAATAAAAA AACTGAAAA
124040 A TATATTTTAT CAAAAGGCGA CTTAGCAATG GTATTAAATG ATAAAACAAA AGATGGGAAA ATTATAGGT
124110 A GAAGTATATT TATAGATAAA GATAATCAAT ATATTTATAA TCAAAGAACT GAAAGATTAA TACCATTTG 
124180 C TGAAAATGAT AATAAATTTT TATGGTTCTT AATGAATACA GATTTAATTA GAAATAAAAT AAAAGGTAT 
124250 G ATGCAAGGAG CAACCCAAGT TTATATAAAT TATTCATCTA TTAAATTGAT ATCTATACAA TTGCCACTT
124320 C TTGAAGAACA ACAGAAAATA AGAGGGTTTC TAGAAGTTTT ATCTGGAATA ACTACTAAAC AATTGCACA
124390 A GATAGACCAA TTAAAAGAGA GGAAAAAGGC GTTTTTACAG AAAATGTTTA TTTGATTTGT CGCTGCAAT
124460 A TAGTTTTTAT TATTTGTTTA TTTCAGATGT TTCACCATCA TATTGCGTAA CTTTTACAAA TAAGAAATA
124530 A AGTTCAATGA AATCAAAAAC GAACATTAAA TTTAGGCACT GTGATAGCAC AGTGTCTTTT TTGTGTCGA

124600 A ATTGTGTACA GAATAAGTAG TTAAATAAAG ATTAAGTTGA GATAAAGTGT TATTCGTAAA TAAAAGAGA
124670 G TAGATCGATA GGAATTGAAT GATATTAGTT AACTATTTAT TAAATTACTT AATAATGATT AATTTTTAG
124740 T TAAAGTAAGT TTAATGTGAA GCACGACCAT TGCTCATTAT AATGAATGAG GATTGTTCGT ATTGCGTAA
124810 T AGAATAAATC AAATAGACTA AAAATTGGGA GCATAGAATT ATGAAATTAA AAAATATTGC TAAAGCAAG
124880 T TTAGCACTAG GGATTTTAAC AACAGGGATG ATTACAACTA CTGCTCAGCC AGTAAAAGCA ATTGAGCAA
124950 A GCAGATTATC AGTTACTTCA AAAGATACAC AAGAATTAAA AAAATACTAC AGTGGAACAG GATATAATT
125020 T TCAAAATGTG AGTGGTTATA GAGAAGGTAA TAAAATGAAC ATTATTGATG GACCACAACT TAATGTAGT 
125090 T ACTTTACTTG GCACAGACAA AGAAAGGTTT AAGGACGATG AAGATTATGA AGGACTTGAT GTATTTGTT 
125160 G TAAGAGAAGG GTCAGGTAAA CACGCAGATA ATATATCAAT TGGTGGAATT ACAAAAACAA ATAAGAATC
125230 A ATATAAAGAC CCTGTACAAA ACGTTAATTT ATTGACTTCT AAGAGTAACG GTCAAAATAC TGCTTCTGT
125300 G ACTTCAGAAT ACTATAGCAT CAATAAAGAA GAAATTTCAT TAAAAGAACT TGATTTCAAA CTAAGAAAG
125370 C AATTAATTGA TAAACATGAT CTTTATAAGA CAGAGCCTAA AGACAGCAAA ATTAAAGTTT CTATGAAAA
125440 A TGGCGGCTAC TATACGTTTG AATTAAATAA AAAATTACAG CCTCATCGCA TGGGTGATAC GATTGATAG
125510 T AGAAATATAA AGAAAATTGA AGTGAATTTA TAATAATATT CGAGGGAGTA TATCATGAGA GAAAATTTT

SEQ ID No: 68 - SSL1 

gene 110541..111236
/gene="MSSA-476ssl1"
CDS 110541..111236
/gene="MSSA-476ssl1"
/product="(Similar to SET6 from strain Mu50)SSL1"
/translation="MGLKIMKFKAIAKASLALGMLATGVITSNVQSVQAKTEVKQQSEADLKLYYNGPSFEYKKVT
GYGFIEGKDRFIDFIYNGQYNKISLVGSDKDKYNEEVNPDIDVHWREGNGRQADNHSIGGITKTNRGVYYDYIHT
PILEIBCKGKEEPQSSLYQIYKEDISLKELDFKLRKQLISQSGLYSNGLKQGQITITMNDGTTHTIDLSQKLEKERM
GESIDGRQIQKILVEMK"



SEQ ID No: 69 - SSL2 

gene 111522..112217
/gene="MSSA-476ssl2"
CDS 111522..112217
/gene="MSSA-476ssl2"
/product="(Similar to SET7 from strain N315)SSL2"
/translation="MKMKSIVKISLLLGILATGVNTTTEKPVHAEPCKPIVISENSKKLKAYYTQPSIEYKNVTGYI
SFIQPSIKFMNIIDGNSVNNIALIGKDKQHYHTGVHRHLNIFYVNEDKRFEGAKYSIGGITSANDKAVDLIAEARV
IKADHIGEYDYDFFPFKIDKEAMSLKEIDFKLRKYLIDNYGLYGEMSTGKITVKKKYYGKYTFELDKKLQEDRMSD
VINVTDIDRIEIKVRKA"

SEQ ID No: 70 - SSL3

gene 112508..113578
/gene="MSSA-476ssl3"
CDS 112508..113578
/gene="MSSA-476ssl3"
/product="(Similar to SET8 from strain Mu50)SSL3"
/translation="MKMRTITKTSLALGLLTTGAITVTTQSVKAEK^
TQAANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEEQKTLNISATPAPKQEQSQTTTESTTQQTKMTTPPST
NTPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFEFEKQFGFLLKPWTTVRFMNVIPNRFIYKIALVG
KDEKKYKDGPYDNIDVFIVLEDNKYQLKKYSVGGITKTNSKKVNHKVELSITKKDNQGMISRDVSEYMITKEEISL
KELDFKLRKQLIEKHNLYGNMGSGTIVIKMKNGGKYTFELHKKLQEHRMADVIDGTNIDNIEVNIK"

SEQ ID No: 71 - SSL4

gene 113943..114890
/gene="MSSA-476ssl4"
CDS 113943..114890
/gene="MSSA-476ssl4"
/product="(Similar to SET9 from strain N315)SSL4"
/translation="I^ITTIAKTSLALGLLTTGVITTTTQAANATTPPSTKVETPQQVANATTPSSTKVEAPQQAA
NATTPSSTKVEAPQSKPNATTPSSTKVEAPQQAANATTPPSSNVDTSPPQSPTTKQVPTEINPKFKDLRAYYTKPS
LEFKNEIGIILKKWTTIRFMNVVPDYFIYKIALVGKDDKKYGEGVHRNVDVFVV
KVDHKAGVRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIEKNNLYGWGSGKIVIKMKNGGKYTFELH
KKLQENRMADVIDGTNIDNIEVNIK"

SEQ ID No: 72 - SSL5 

gene 115254..115958
/gene="MSSA-476ssl5"
CDS 115254..115958
/gene="MSSA-476ssl5"
/product="(Similar to SET10 from strain Mu50)SSL5"
/translation="MKMAAIAKASLALGILATGTITSLHQTVNASEHEAKYENVTKDIFDLRDY
YSGASKELKNVTGYRYSKGGKHYLIFDKHQKFTRIQIFGKDIERFKARKNPGLDIFVVKEAENRNGTVFSYGGVTK
KNQDAYYDYINAPRFQIKRDEGDGIATYGRVHYIYKEEISLKELDFKLRQYLIQNFDLYKKFPKDSKIKVIMKDGG
YYTFELNKKLQTNRMSDVIDGRNIEKIEANIR"

SEQ ID No: 73 - SSL6 

gene 116404..117102
/gene="MSSA-476ssl6"
CDS 116404..117102
/gene="MSSA-476ssl6"
/product="(Similar to SET2 from strain NCTC6571)SSL6"
/translation="MKLKALAKATLVLGLLATGVITTESQTVKAAESTQGQHNYKSLKYYYSKPSIELINVDGLYR
QHLTDKGAYWKNLKDYYIGLLGEDSKKFKSDVYGDLDAFLVIEEEPVKGRQYSIGGISKTOSKEFKEREVDVKVT
RKADRDTTSTKDSKFKITKEEISLKELDFKLRQKLMKEENLYDAINHRKGKIWKMEDDKFYTFELTKKLQPHRMG
DTIDGTKIKEINVELEYK"



SEQ ID No: 74 - SSL7 

gene 117524..118219
/gene="MSSA-476ssl7"
CDS 117524..118219
/gene="MSSA-476ssl7"
/product="(Similar to SET1-C)SSL7"
/translation="MKLKTLAKATLALGLLTTGVITSEGQAVQAKEKQERVQHLYDIKDLHRYYSSESFEFSNISG
KVENYNGSNVVRFNQENQNHQLFLSGKDKDKYKEGLEGQNVFVVKELIDPNGRLSTVGGVTKKNNQSSETNTPLFI
KKVYGGNLDASIESFLINKEEVSLKELDFKIRQHLVKNYGLYKGTTKYGKITFNLKDGEKQEIDLGDKLQFEHMGD
VLNSKDIQNIAVTINQI"



SEQ ID No: 75 - SSL8 

gene 118558..119256
/gene="MSSA-476ssl8"
CDS 118558..119256
/gene="MSSA-476ssl8"
/product="(Similar to SET12 from strain Mu50)SSL8"
/translation="MKFTAIAKAIFVLGILTTSVMITENQSVNAKGKYEKMNRLYDTNKLHQYYSGPSYELTNVSG
QSQGYYDSNVLLFNQQNQKFQVFLLGKDEKKYKEKTHGLDVFAVPELVDLDGRIFSVSGVTKKNVKSIFESLRTPN
LLVKKIDDKDGFSYDEFFFIQKEEVSLKELDFKIRKLLIKKYKLYEGAADKGRIVINMKDENKYEIDLSDKLGFER
MADVINSEQIKNIEVNLK"



SEQ ID No: 76 - SSL9

gene 119635..120333
/gene="MSSA-476ssl9"
CDS 119635..120333
/gene="MSSA-476ssl9"
/product="(Similar to SET13 from strain Mu50)SSL9"
/translation="MKFTALAKATLALGILTTGVFTTESKAVHAKVELDETQRKYYINMLHQYYSEESFEPTNISV
KSEDYYGSNVLNFKQRNKAFKVFLLGDDKNKYKEKTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPS
LQVKKIDPKHGFSINELFFIQKEEVSLKELDFKIRKMLVEKYRLYKGASDKGRIVINMKDEKKYVIDLSEKLSFDR
MFDVMDSKQIKNIEVNLN"

SEQ ID No: 77 - SSL10 

gene 120692..121375
/gene="MSSA-476ssl10"
CDS 120692..121375
/gene="MSSA-476ssl10"
/product="(Similar to SET14 from strain Mu50)SSL10"
/translation="MKLTAIAKAALALGILTTGTLTTEVHSGHAKQNQKSVNKHDKEALYRYYTGKTMEMKNISAL
KHGKNNLRFKFRGIKIQVLLPGNDKSKFQQRSYEGLDVFFVQEKRDKHDIFYTVGGVIQNNKTSGVVSAPILNISK
EKGEDAFVKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRVKISLKDGSFYNLDLRSKLKFKYMGEVI
ESKQIKDIEVNLK"



SEQ ID No: 78 - SSL11 
gene 124851..125543
/gene="MSSA-476ssl11"
CDS 124851..125543
/gene="MSSA-476ssl11"
/product="(Similar to SET15 from strain Mu50)SSL11"
/translation="MKLKNIAKASLALGILTTGMITTTAQPVKAIEQSRLSVTSKDTQELKKYYSGTGYNFQNVSG
YREGNKMNIIDGPQLNVVTLLGTDKERFKDDEDYEGLDVFVVREGSGKHADNISIGGITKTNKNQYKDPVQNVNLL
TSKSNGQNTASVTSEYYSINKEEISLKELDFKLRKQLIDKHDLYKTEPKDSKIKVSMKNGGYYTFELNKKLQPHRM
GDTIDSRNIKKIEVNL"

SEQ ID No: 79 



62000 C CTAAATTGTA AGCGCATACA AAATAAACAC AACCTACTAT TAAAATTTGT AATATTTTAT CAATAATTA 

62070 A ATGAACATTT TATTAATATT AAATTTAAGT AGTAGGAAAT AATTAAAATA AGTACTACAT TTAAAGTAT 

62140 A ACTATTTTTC AAGTAGTTAG AAAATTCAAT TATCAAACAA TTTAATGCAA TTGATTAGAG AATAATTGT
62210 A ACGTGTCGTT TTTAATATAT AACTCCCGCC TACTTTATTA AGTACTGTTT CTGTCCAAAA CTTAAAAAT
62280 G ATAAGTTTTG CTTAAATAAC ACTACTAACT GTTTAAGTTT ATTTAACATA GTTTTAGCTT TTATTTAAT
62350 T CCGAATCGGT GTAATAGCTT ATATACTTTG GGTAATTCAC GCAAAGGAGA TTTTCATATG AAAAAGAAC 
62420 A TCATGAATAA ATTAGTTTTA TCAACAGCAT TGTTACTTTT AGGAACTACA TCAACACAAC TTCCTAAAA 
62490 C ACCAATCAGT TTTTCATCTG AAGCAAAAGC CTATAATATC AGTGAAAACG AGACTAATAT CAATGAGTT 
62560 A ATAAAGTATT ATACACAGCC TCATTTATCA CTATCAAATA AATGGTTATG GCAAAAGCCC AATGGTAGC
62630 A TTCATGCAAC ATTGCAAACG TGGGTTTGGT ATAGTCATAT TCAAGTGTTT GGATCCGAGA GTTGGGGAA
62700 A CATTAATCAG TTAAGAAATA AATACGTTGA TATATTTGGA ACTAAAGATG AGGACACAGT TGAAGGTTA
62770 C TGGACTTATG ATGAAACATT TACTGGTGGT GTTACGCCAG CAGCTACTTC ATCTGATAAG CCTTATAGA
62840 C TATTTTTAAA ATATAGTGAT AAACAACAAA CTATCATCGG TGGACATGAA TTTTACAAAG GAAATAAAC 

62910 C AGTATTAACT TTAAAAGAAT TAGATTTCCG TATTCGTCAA ACATTAATAA AAAATAAAAA GTTATATAA
62980 C GGAGAATTTA ATAAAGGTCA AATTAAGATA ACTGCTGATG GAAATAATTA CACGATTGAT TTAAGTAAA
63050 A AGTTAAAATT AACTGACACA AACCGTTATG TTAAAAATCC TAAAAATGCA CAAATTGAAG TCATACTCG 
63120 A AAAATCTAAC TaACCTATTA CCTTTTGTAA ATGCGGATAa TTTCAAttaT CTAATTAaCC CCTTTTATA 
63190 A TTAAACATTC CAacaaTACT CAAAGGAGaa AttCGAATga acAATaacaT CaCGaAAAAA ATTATTTTA
63260 T CaaCAACATT GTTACTATTA GGTACAGCAT CTACACAATT TCCTAATACA CCTATCAATT CTTCATCTG
63330 A AGCGAAAGCT TATTATATAA ATCAAAACGA AACTAACGTT AATGAGTTAA CTAAATATTA CTCGCAAAA
63400 A TATTTAACCT TCTCTAACAG TACGTTATGG CAAAAAGATA ACGGTACGAT TCATGCAACG TTGTTACAG 
63470 T TTTCTTGGTA TAGTCATATT CAAGTTTATG GACCTGAAAG TTGGGGCAAT ATCAACCAAT TAAGAAATA 
63540 A AAGCGTTGAT ATTTTTGGCA TAAAAGACCA AGAAACCATT GATTCTTTTG CATTATCTCA AGAAACGTT 

63610 T ACTGGTGGTG TTACTCCTGC AGCAACATCT AACGATAAAC ACTATAAACT GAATGTGACA TATAAAGAT
63680 A AAGCAGAAAC GTTTACTGGC GGATTTCCAG TTTATGAAGG CAATAAGCCT GTTTTAACTT TAAAAGAAT
63750 T AGATTTTCGT ATTCGTCAAA CATTAATTAA AAGTAAAAAA TTATATAATA ATTCTTATAA TAAAGGACA
63820 A ATTAAAATAA CAGGTGCAGA CAATAACTAC ACAATAGATT TAAGTAAAAG GTTGCCATCA ACTGATGCA
63890 A ATAGATATGT TAAAAAACCT CAAAATGCAA AAATTGAAGT TATCCTCGAA AAATCAAACT AACAATAAT
63960 A ATGGAGTTAA TAAAAATAAT CGCAAATACT ATATTGACTT CGCTCACATT TAAATTTCTT ATTCCTCGT

64030 A TCATGATTCC TCTGAAAGGA GATGTTCTAA TGAGTAAGAA CATCACGAAA AATATAATTT TAACGACAA 
64100 C ATTATTACTA TTAGGTACTG TATTACCTCA AAATCAAAAA CCAGTATTTA GTTTTTACTC TGAAGCTAA
64170 A GCTTATAGCA TTGGTCAAGA TGAAACTAAC ATCAATGAAT TAATTAAATA TTACACACAG CCTCATTTT
64240 T CATTTTCAAA TAAATGGCTA TATCAATATG ATAATGGAAA CATTTATGTT GAACTTAAGA GATATTCAT
64310 G GTCAGCACAT ATATCTTTAT GGGGCGCTGA AAGTTGGGGA AATATTAATC AGTTAAAAGG TCGTTACGT
64380 A GATGTGTTTG GACTAAAAGA CAAAGATACT GATCAGTTAT GGTGGTCTTA TAGAGAGACA TTTACAGGT
64450 G GCGTTACACC AGCCGCAAAA CCTTCTGATA AAACTTATAA TCTTTTTGTG CAATACAAAG ATAAACTAC
64520 A AACGATTATT GGTGCGCATA AAATATACCA AGGCAATAAA CCAGTATTAA CATTGAAAGA AATCGATTT
64590 C CGTGCACGAG AAGCGTTAAT AAAAAATAAA ATATTATATA ACGAAAATCG TAATAAAGGT AAGCTTAAG
64660 A TCACCGGTGG CGGTAATAAC TACACTATTG ATTTAAGCAA AAGATTACAT TCAGATCTAG CAAATGTTT

64730 A TGTTAAAAAT CCTAATAAAA TAACTGTTGA CGTCCTCTTT GATTAGTATA TGAAGGTGAC TTATACTTC 
64800 A TGCACTTTAA TTCCAAATCA GATTATTTAA ATGATAATTT TTAAAGTGTA TGATGTATAT AATAGGTAA 
64870 A ATTTTCTATA TATTTAAATG GAATTGGGAG TAGGAATGTG ACAGAAATAG TATTTTATAA AATTTATTT
64940 C GTTGTCACTC CCCAACTTGC ATTGTCTGTA GAATTTCTTT TTGAAATTCT CTATGTTGGG GCCCCGCCA
65010 A CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC CGCCTATAAT TGAAAAATG
65080 C TTGTTACATG GGCATTTTCA TTCGGTCAAC TACTACCAAT ATAATATTGT AGAGCCTAAG ACATTGATT
65150 T ATTATGTCTT AGGCTCTATT CCTTCATTTA ATGATTCAAT TATTATAGCA ATACTTTATT GTCCCATGA
65220 T TAGTGTTCTT TTAATGAGAC ATAGTAACTA TAAAGTTTAA TAATCGTTCT AAATCTAGCA TCTTCTAGT
65290 T TGGTTTTCCC ATTTCTTAAA TCTTGTACTG TTTGATATGG AACTCCTGAA TTTTTCGAAA TTTTATAGC
65360 C TGATTCAGAC TCAAaCAAAG ATTGAAGAGA GTTTATAATA TTATTCAACT CTGTCATTTT TAATCCCCT

65430 T TCTTGAATTA ACAATATATA ATGTTGTTAT TAAAACAGTC AGTGTATGGA TGATTTCATT TCCTAAAAA 
65500 T 




SEQ ID No: 80 - SSL12 

gene 62408..63133
/gene="MSSA-476ssl12"
CDS 62408..63133
/gene="MSSA-476ssl12"
/product="(Similar to SAV1168 from strain Mu50)SSL12"
/translation="MKKNIMNKLVLSTALLLLGTTSTQLPKTPISFSSEAKAYNISENETNINELIKYYTQPHLSL
SNKWLWQKPNGSIHATLQTKTW^
SDKPYRLFLKYSDKQQTIIGGHEFYKGNKPVLTLKELDFRIRQTLIKNKKLYNGEFNKGQIKITADGNNYTIDLSK
KLKLTDTNRYVKNPKNAQIEVILEKSN"

SEQ ID No: 81 - SSL13 

gene 63227..63952
/gene="MSSA-476ssl13"
CDS 63227..63952
/gene="MSSA-476ssl13"
/product="(Similar to SA1010 from strain N315)SSL13"
/translation="MNNNITKKIILSTTLLLLGTASTQFPNTPINSSSEAKAYYINQNETNVNELTKYYSQKYLTF
SNSTLWQKDNGTIHATLLQFSWYSHIQVYGPESWGNINQLRNKSVDIFGIKDQETIDSFALSQETFTGGVTPAATS
NDKHYKLNVTYKDKAETFTGGFPVYEGNKPVLTLKELDFRIRQTLIKSKKLYNNSYNKGQIKITGADNNYTIDLSK
RLPSTDANRYVKKPQNAKIEVILEKSN"



SEQ ID No: 82 - SSL14 
gene 64060..64776
/gene="MSSA-476ssl14"
CDS 64060..64776
/gene="MSSA-476ssl14"
/product="(Similar to SAV1166 from strain Mu50)SSL14"
/translation="MSKNITKNIILTTTLLLLGTVLPQNQKPVFSFYSEAKAYSIGQDETNINELIKYYTQPHFSF
SNKWLYQYDNGNIYVELKRYSWSAHISLWGAESWGNINQLKGRYVDVFGLKDKDTDQLWWSYRETFTGGVTPAAKP
SDKTYNLFVQYKDKLQTIIGAHKIYQGNKPVLTLKEIDFRAREALIKNKILYNENRNKGKLKITGGGNNYTIDLSK
RLHSDLANVYVKNPNKITVDVLFD"

S. aureus strain COL, taken from the unpublished genome project at The Institute for Genomic Research (TIGR) via ViroloGenome

SEQ ID No: 83



ATGAAATTTAAAGCGATAGCAAAAGCAAGTTTAGCATTGGGAATGTTAGCAACAGGTGTAATTACATCGAATGTACAATCAGTAC 

AAGCGAAAGCAGAAGTTAAACAACAAAGTGAATCAGAGTTAAAACACTATTATAATAAACCAATTTTAGAGCGTAAAAATGTGAC 

TGGATTTAAATATACTGATGAGGGTAAACACTATTTAGAAGTCACAGTAGGGCAACAGCATTCTCGAATCACTTTACTTGGATCT 

GATAAAGATAAATTTAAAGACGGAGAAAACTCAAATATAGATGTGTTTATCCTTAGAGAAGGTGACAGTAGACAAGCAACAAATT 

ACTCAATTGGTGGCGTTACAAAATCAAATAGTGTGCAGTATATTGATTATATCAATACGCCAATTTTAGAAATCAAGAAAGATAA

TGAAGATGTACTTAAAGATTTTTACTACATTTCAAAAGAAGACATCTCATTAAAAGAACTTGATTATAGATTAAGAGAACGTGCG 

ATTAAACAACACGGCTTGTATTCAAATGGTCTTAAACAAGGTCAAATTACAATTACAATGAATGATGGCACAACACATACAATCG 

ATTTAAGTCAAAAACTTGAAAAAGAACGTATGGGTGAGTCAATCGACGGCACTAAGATTAATAAAATTCTAGTAGAAATGAAATA 
A 



SEQ ID No: 84 - SSL1

gene 1..681
/gene="COLssl1"
CDS 1..681
/gene="COLssl1"
/product="(Similar to SET6 from strain N315)SSL1"
/translation="MKFKAIAKASLALGMLATGVITSNVQSVQAKAEVKQQSESELKHYYNKPILERKNVTGFKYT
DEGKHYLEVTVGQQHSRITLLGSDKDKFKDGENSNIDVFILREGDSRQATNYSIGGVTKSNSVQYIDYINTPILEI
KKDNEDVLKDFYYISKEDISLKELDYRLRERAIKQHGLYSNGLKQGQITITMNDGTTHTIDLSQKLEKERMGESID
GTKINKILVEMK"

SEQ ID No: 85

ATGAAATTAAAAAATATTGCTAAAGCAAGTTTAGCACTAGGGATTTTAACAACAGGGATGATTACAACTACTGCTC 
AGCCAGTAAAAGCAAGTACATTAGAGGTTAGATCACAAGCTACTCAAGACTTGAGTGAATATTATAATAGACCGTT 
CTTTGAGTATACAAATCAGTCAGGATATAAAGAGGAAGGAAAAGTGACGTTTACTCCTAATTATCAACTTATAGAT 

GTAACTTTAACTGGGAATGAAAAGCAAAATTTTGGTGAAGATATTTCTAATGTAGATATATTTGTTGTAAGAGAAA
ATTCTGATAGATCTGGTAATACAGCTTCAATTGGTGGTATTACTAAAACAAACGGTTCAAATTATATTGATAAAGT
AAAAGATGTAAATTTAATAATTACTAAAAACATCGATAGTGTTACATCAACGTCAACATCATCTACATATACAATT
AATAAAGAAGAAATTTCATTAAAAGAACTTGATTTTAAATTAAGAAAGCATTTAATTGATAAACATAACCTTTATA 
AGACAGAACCTAAAGACAGTAAAATTCGAATTACTATGAAAGATGGTGGGTTCTACACATTTGAATTGAATAAAAA 

GTTACAAACACACCGTATGGGTGATGTTATTGATGGCAGAAATATAGAAAAAATTGAAGTGAATTTATAA

SEQ ID No: 86 - SSL2

gene 1..678
/gene="COLssl2"
CDS 1..678
/gene="COLssl2"
/product="(Similar to SET7 from strain N315)SSL2"
/translation="MKMKNIAKISLLLGILATGVNTTT^
SFIQPSIKFMNIIDGNSWNIALIGKDKQHYHTGVHRNL^^
IKEDHTGEYDYDFFPFKIDKEAMSLKEIDFKLRKYLIDNYGLYGEMSTGKI^
VTNVTDIDRIEIKVTKA"



SEQ ID No: 87

ATGAAAATGAGAACAATTGCTAAAACCAGTTTAGCACTAGGGCTTTTAACAACAGGCGCAATTACAGTAACGACGCAATCGGTCA 
AAGCAGAAAAAATACAATCAACTAAAGTTGACAAAGTACCAACGCTTAAAGCAGAGCGATTAGCAATGATAAACATAACAGCAGG
TGCAAATTCAGCGACAACACAAGCAGCTAACACAAGACAAGAACGCACGCCTAAACTCGAAAAGGCACCAAATACTAATGAGGAA 

AAAACCTCAGCTTCCAAAATAGAAAAAATATCACAACCTAAACAAGAAGAGCAGAAAACGCTTAATATATCAGCAACGCCAGCGC
CTAAACAAGAACAATCACAAACGACAACCGAATCCACAACGCCGAAAACTAAAGTGACAACACCTCCATCAACAAACACGCCACA 
ACCAATGCAATCTACTAAATCAGACACACCACAATCTCCAACCATAAAACAAGCACAAACAGATATGACTCCTAAATATGAAGAT 
TTAAGAGCGTATTATACAAAACCGAGTTTTGAATTTGAAAAGCAGTTTGGATTTATGCTCAAACCATGGACGACGGTTAGGTTTA 
TGAATGTTATTCCAAATAGGTTCATCTATAAAATAGCTTTAGTTGGAAAAGATGAGAAAAAATATAAAGATGGACCTTACGATAA 

TATCGATGTATTTATCGTTTTAGAAGACAATAAATATCAATTGAAAAAATATTCTGTCGGTGGCATCACGAAGACTAATAGTAAA
AAAGTTAATCACAAAGTAGAATTAAGCATTACTAAAAAAGATAATCAAGGTATGATTTCACGCGATGTTTCAGAATACATGATTA
CTAAGGAAGAGATTTCCTTGAAAGAGCTTGATTTTAAATTGAGAAAACAACTTATTGAAAAACATAATCTTTACGGTAACATGGG 
TTCAGGAACAATCGTTATTAAAATGAAAAACGGTGGGAAATATACGTTTGAATTACACAAAAAACTGCAAGAGCATCGTATGGCA 
GGCACTAATATTGATAACATTGAAGTGAATATAAAATAA 


SEQ ID No: 88 - SSL3 

gene 1..1059
/gene="COLssl3"
CDS 1..1059
/gene="COLssl3"
/product="(Similar to SET8 from strain N315)SSL3"
/translation="MKMRTIAKTSLALGLLTTGAITVTTQSVKAEKIQSTKVDKVPTLKAERLAMINITAGANSAT
TQAANTRQERTPKLEKAPNTNEEKTSASKIEKISQPKQEEQKTLNISATPAPKQEQSQTTTESTTPKTKVTTPPST
NTPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFEFEKQFGFMLKPWTTVRFMNVIPNRFIYKIALVG
KDEKKYKDGPYDNIDVFIVLEDNKYQLKKYSVGGITKTNSKKVNHKVELSITKKDNQGMISRDVSEYMITKEEISL
KELDFKLRKQLIEKHNLYGNMGSGTIVIKMKNGGKYTFELHKKLQEHRMAGTNIDNIEVNIK"



SEQ ID No: 89 

atgaaaataacaacgattgctaaaacaagtttagcactaggccttttaacaacaggtgtaatcacaacgacaacgc
aagcagcaaacgcgacaacaccatcttccactaaagtggaagcaccacaatcaacaccgccctcaactaaaataga 
agcaccgcaatcaaaaccaaacgcgacaacaccgccctcaactaaagtagaagcaccgcaacaaacagcaaatgcg 
acaacaccgccttcaactaaagtgacaacacctccatcaacaaacacgccacaaccaatgcaatctactaaatcag 
acacaccacaatcgccaaccacaaaacaagtaccaacagaaataaatcctaaatttaaagatttaagagcgtatta 

tacgaaaccaagtttagaatttaaaaatgagattggtattattttaaaaaaatggacgacaataagatttatgaat
gttgtcccagattatttcatatataaaattgctttagttggtaaagatgataaaaaatatggtgaaggagtacata 
ggaatgtcgatgtatttgtcgttttagaagaaaataattacaatctggaaaaatattctgtcggtggtatcacaaa 
gagtaatagtaaaaaagttgatcacaaagcaggagtaagaattactaaggaagataataaaggtacaatctctcat 
gatgtttcagaattcaagattactaaagaacagatttccttgaaagaacttgattttaaattgagaaaacaactta 

ttgaaaaaaataatctgtacggtaacgttggttcaggtaaaattgttattaaaatgaaaaacggtggaaagtacac
gtttgaattgcacaaaaaattacaagaaaatcgcatggcagatgtcattaatagtgaacaaattaaaaacatcgaa 
gtgaatttgaaa 

SEQ ID No: 90 - SSL4

gene 1..924
/gene="COLssl4"
CDS 1..924
/gene="COLssl4"
/product="(Similar to SET9 from strain N315)SSL4"
/translation="MKITTIAKTSLALGLLTTGVITTTTQAANATTPSSTKVEAPQSTPPSTKIEAPQSKPNATTP
PSTKVEAPQQTANATTPPSTKVTTPPSTNTPQPMQSTKSDTPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEI
GIILKKWTTIRFMNVVPDYFIYKIALVGKDDKKYGEGra^
VRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQ^^
MADVINSEQIKNIEVNLK"

SEQ ID No: 91 



atgaaattaacaacgatagctaaagcaacattagcattaggaatattaactacaggtgtgtttacagcagaaagtc 
aaactggtcacgcgaaagtagaacttgatgagacacaacgcaaatattatatcaatatgctacatcaatactattc
tgaagaaagttttgaaccaacaaacattagtgttaaaagcgaagattactatggctctaacgttttaaactttaaa 
caacgaaataaagcttttaaagtatttttacttggtgacgataaaaataaatataaagaaaaaacacatggccttg 
atgtctttgcagtacctgaattaatagatataaaaggtggcatatatagcgttggcggtataacaaagaaaaatgt 
gagatcagtgtttggatttgtaagtaatccaagtctacaagttaaaaaagttgatgctaaaaatggcttttcgata 
aacgagttgttttttattcaaaaggaagaagtatcattgaaggaactggactttaaaataagaaaactcttaatcg
aaaaatatagattgtataaaggaacgtctgataaaggtagaattgttatcaatatgaaagacgaaaagaagcatga 
aattgatttaagtgaaaaattaagttttgaacgtatgtttgatgtaatggatagtaagcaaattaaaaatattgaa 
gtgaatttgaat 



SEQ ID No: 92 - SSL9 



gene 1..696
/gene="COLssl9"
CDS 1..696
/gene="COLssl9"
/product="(Similar to SET13 from strain N315)SSL9"
/translation="MKLTTIAKATLALGILTTGVFTAESQTGHAKVELDETQRKYYINMLHQYYSEESFEPTNISV
KSEDYYGSNVLNFKQRNKAFKVFLLGDDKNKYKEKTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPS
LQVKKVDAKNGFSINELFFIQKEEVSLKELDFKIRKLLIEKYRLYKGTSDKGRIVINMKDEKKHEIDLSEKLSFER
MFDVMDSKQIKNIEVNLN"



SEQ ID No: 93 



Atgaaatttacagcattagcaaaagcgacattagctttaggaattttaacaacaggaactttaacaacagaagttc 
attcaggtcatgcaaaacaaaatcaaaagtcagtaaataaacatgacaaggaagcattataccgatactacactgg 

aaagactatggaaatgaaaaatattagtgctttgaaacatggtaaaaacaacttacgttttaagtttagaggtatt
aagattcaagttttactgcctggaaatgataaaagtaaatttcaacagcgtagttatgaggggttagatgttttct 
ttgttcaagaaaaaagagataagcacgatatattttatactgttggtggtgtaatacagaataataaaacatctgg 
agttgtcagtgcaccaatattaaatatttcaaaagaaaagggtgaagatgcttttgtgaaaggttacccttattac 
attaaaaaagaaaaaataacactaaaagaactggattataagttgagaaagcatctaattgaaaaatacggacttt 

ataaaacaatctcaaaagatggtagggtcaaaattagcttgaaagatggcagtttttataaccttgatttaagatc
taaattaaaatttaaatatatgggggaagtcatagaaagcaaacaaattaaagatattgaagttaacttaaag 



SEQ ID No: 94 - SSL10

gene 1..681
/gene="COLssl10"
CDS 1..681
/gene="COLssl10"
/product="(Similar to SET14 from strain N315)SSL10"
/translation="MKFTALAKATLALGILTTGTLTTEVHSGHAKQNQKSVNKHDKEALYRYYTGKTMEMKNISAL
KHGKNNLRFKFRGIKIQVLLPGNDKSKFQQRSYEGLDVFFVQEKRDKHDIFYTVGGVIQNNKTSGVVSAPILNISK
EKGEDAFVKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRVKISLKDGSFYNLDLRSKLKFKYMGEVI
ESKQIKDIEVNLK"
SEQ ID No: 95

atgaaattaaaaaatattgctaaagcaagtttagcactagggattttaacaacagggatgattacaactactgctc
agccagtaaaagcaagtacattagaggttagatcacaagctactcaagacttgagtgaatattataatagaccgtt 
ctttgagtatacaaatcagtcaggatataaagaggaaggaaaagtgacgtttactcctaattatcaacttatagat 
gtaactttaactgggaatgaaaagcaaaattttggtgaagatatttctaatgtagatatatttgttgtaagagaaa 
attctgatagatctggtaatacagcttcaattggtggtattactaaaacaaacggttcaaattatattgataaagt 
aaaagatgtaaatttaataattactaaaaacatcgatagtgttacatcaacgtcaacatcatctacatatacaatt
aataaagaagaaatttcattaaaagaacttgattttaaattaagaaagcatttaattgataaacataacctttata 
agacagaacctaaagacagtaaaattcgaattactatgaaagatggtgggttctacacatttgaattgaataaaaa 
gttacaaacacaccgtatgggtgatgttattgatggcagaaatatagaaaaaattgaagtgaattta 

SEQ ID No: 96 - SSL11

gene 1..675
/gene="COLssl11"
CDS 1..675
/gene="COLssl11"
/product="(Similar to SET15 from strain N315)SSL11"
/translation="MKLKNIAKASLALGILTTGMITTTAQPVKASTLEVRSQATQDLSEYYNRPFFEYTNQSGYKE
EGKVTFTPNYQLIDVTLTGNEKQNFGEDISNVDIFVVRENSDRSGNTASIGGITKTNGSNYIDKVKDVNLIITKNI
DSVTSTSTSSTYTINKEEISLKELDFKLRKHLIDKHNLYKTEPKDSKIRITMKDGGFYTFELNKKLQTHRMGDVID
GRNIEKIEVNL"

SEQ ID No: 97

atgaaaaagaacatcatgaataaattagttttatcaacagcattgttacttttagaaactacatcaacacaacttc 
ctaaaacaccaatcagtttttcatctgaagcaaaagcctataatatcagtgaaaacgagactaatatcaatgaact 
aatcaaatattacactcagccgcatttttcattatctggaaaatggttatggcaaaagcccaatggtagcattcat 

gcaacattgcaaacgtgggtttggtatagtcatattcaagtgtttggatccgagagttggggaaacattaatcagt
taagaaataaatacgttgatatatttggaactaaagatgaggacacagttgaaggttactggacttatgatgaaac 
atttactggtggtgttacgccagcagctacttcatctgataagccttatagactatttttaaaatatagtgataaa 
caacaaactatcatcggtggacatgaattttacaaaggaaataaaccagtattaactttaaaagaattagatttcc 
gtattcgtcaaacattaataaaaaataaaaagttatataacggagaatttaataaaggtcaaattaagataactgc 

tgatggaaataattacacgattgatttaagtaaaaagttaaaattaactgacacaaaccgttatgttaaaaatcct
cgtaatgcagaaattgaagtcatactcgaaaaatctaac 



SEQ ID No: 98 - SSL12

gene 1..723
/gene="COLssl12"
CDS 1..723
/gene="COLssl12"
/product="(similar to SA1011)SSL12"
/translation="MKKNIMNKLVLSTALLLLETTSTQLPKTPISFSSEAKAYNISENETNINELIKYYTQPHF
SLSGKWLWQKPNGSIHATLQTWVWYSHIQVFGSESWGNINQLRNKYVDIFGTKDEDTVEGYWTYDETFTGGVTPAA
TSSDKPYRLFLKYSDKQQTIIGGHEFYKGNKPVLTLKELDFRIRQTLIKNKKLYNGEFNKGQIKITADGNNYTIDL
SKKLKLTDTNRYVKNPRNAEIEVILEKSN"



SEQ ID No: 99 

atgaacaataacatcacgaaaaaaattattttatcaacaacattgttactattaggtacagcatctacacaatttc
ctaatacacctatcaattcttcatctgaagcgaaagcttattatataaatcaaaacgaaactaacgttaatgagtt 
aactaaatattactcgcaaaaatatttaaccttctctaacagtacgttatggcaaaaagataacggtacgattcat 
gcaacgttgttacagttttcttggtatagtcatattcaagtttatggacctgaaagttggggcaatatcaaccaat 
taagaaataaaagcgttgatatttttggcataaaagaccaagaaaccattgattcttttgcattatctcaagaaac 

gtttactggtggtgttactcctgcagcaacatctaacgataaacactataaactgaatgtaacatataaagataaa
gcagaaacgtttactggcggatttccagtttatgaaggcaataagcctgttttaactttaaaagaattagattttc
gtattcgtcaaacattaattaaaagtaaaaaattatataataattcttataataaaggacaaattaaaataacagg
tgcagacaataactacacaatagatttaagtaaaaggttgccatcaactgatgcaaatagatatgttaaaaaacct
caaaatgcaaaaattgaagttatcctcgaaaaatcaaac

SEQ ID No: 100 - SSL13

gene 1048124..1048849
/gene="COLssl23"
CDS 1048124..1048849
/gene="COLssl23"
/product="(similar to SA1010 from strain N315) SSL13"
/translation="MNNNITKKIILSTTLLLLGTASTQFPNTPINSSSEAKAYYINQNETNVNELTKYYSQKYL
TFSNSTLWQKDNGTIHATLLQFSWYSHIQVYGPESWGNINQLRNKSVDIFGIKDQETIDSFALSQETFTGGVTPAA
TSNDKHYKLNVTYKDKAETFTGGFPVYEGNKPVLTLKELDFRIRQTLIKSKKLYNNSYNKGQIKITGADNNYTIDL
SKRLPSTDANRYVKKPQNAKIEVILEKSN"

SEQ ID No: 101

atgagtaagaacatcacgaaaaatataattttaacgacaacattattactattaggtactgtattacctcaaaatc
aaaaaccagtatttagtttttactctgaagctaaagcttatagcattggtcaagatgaaactaacatcaatgaatt
aattaaatattacacacagcctcatttttcattttcaaataaatggctatatcaatatgataatggaaacatttat
gttgaacttaagagatattcatggtcagcacatatatctttatggggcgctgaaagttggggaaatattaatcagt
taaaagatcgttacgtagatgtgtttggactaaaagacaaagatactgatcagttatggtggtcttatagagagac
atttacaggtggcgttacaccagccgcaaaaccttctgataaaacttataatctttttgtgcaatacaaagataaa
ctacaaacgattattggtgcgcataaaatataccaaggcaataaaccagtattaacattgaaagaaatcgatttcc
gtgcacgagaagcgttaataaaaaataaaatattatataacgaaaatcgtaataaaggtaagcttaagatcaccgg
tggcggtaataactacactattgatttaagcaaaagattacattcagatctagcaaatgtttatgttaaaaatcct
aataaaataactgttgacgtcctctttgat



SEQ ID No: 102 - SSL14

gene 714..714
/gene="COLssl14"
CDS 714..714
/gene="COLssl14"
/product="(similar to SA1009 from strain N315) SSL14"
/translation="MSKNITKNIILTTTLLLLGTVLPQNQKPVFSFYSEAKAYSIGQDETNINELIKYYTQPHF
SFSNKWLYQYDNGNIYVELKRYSWSAHISLWGAESWGNINQLKDRYVDVFGLKDKDTDQLWWSYRETFTGGVTPAA
KPSDKTYNLFVQYKDKLQTIIGAHKIYQGNKPVLTLKEIDFRAREALIKNKILYNENRNKGKLKITGGGNNYTIDL
SKRLHSDLANVYVKNPNKITVDVLFD"

SEQ ID No: 103

S. aureus strain N315 (SSL12-SSL14) - coding strand

1 CTACTGTTCT
61 GTGCAGTATC
121 TTACATTTGC
181 CTTTTTCAAA
241 ATTTAGCACG
301 AGTCTAATAA
361 ATTTTAGTTA
421 TGTAATATTT
481 AATAATTAAA
541 AATTATCAAA
601 TATAACTCCC
661 TTGCTTAAAT
721 ATTCCGAATC
781 ATGAAAAAGA
841 ACATCAACAC
901 ATCAGTGAAA
961 TCATTATCTG
1021 ACGTGGGTTT
1081 CAGTTAAGAA ATAAATACGT TGATATATTT GGAACTAAAG ATGAGGACAC AGTTGAAGGT
1141 TACTGGACTT ATGATGAAAC ATTTACTGGT GGTGTTACGC CAGCAGCTAC TTCATCTGAT
1201 AAACCTTATA GACTATTTTT AAAATATAGT GATAAACAAC AAACTATCAT CGGTGGACAT
1261 GAATTTTACA AAGGAAATAA ACCAGTATTA ACTTTAAAAG AATTAGATTT CCGTATTCGT
1321 CAAACATTAA TAAAGAATAA AAAGTTATAT AACGGAGAAT TTAATAAAGG TCAAATTAAG
1381 ATAACTGCTG ATGGAAATAA TTACACGATT GATTTAAGTA AAAAGTTAAA ATTAACTGAC
1441 ACAAACCGTT ATGTTAAAAA TCCTAAAAAT GCACAAATTG AAGTCATACT CGAAAAATCT
1501 AACTAACCTA TTACCTTTTG TAAATGCGGA TAATTTCAAT TATCTAATTA ACCCCTTTTT
1561 ATAATTAAAC ATTCCAACAA TACTCAAAGG AGAAATTCGA ATGAACAATA ACATCACGAA
1621 AAAAATTATT TTATCAACAA CATTGTTACT ATTAGGTACA GCATTTACAC AATTTCCTAA
1681 TACACCTATC AATTCTTCAT CTGAAGCGAA AGCTTATTAT ATAAATCAAA ACGAAACTAA
1741 CGTTAATGAG TTAACTAAAT ATTACTCGCA AAAATATTTA ACCTTCTCTA ACAGTACGTT
1801 ATGGCAAAAA GATAACGGTA CGATTCATGC AACGTTGTTA CAGTTTTCTT GGTATAGTCA
1861 TATTCAAGTT TATGGACCTG AAAGTTGGGG CAATATCAAC CAATTAAGAA ATAAAAGCGT
1921 TGATATTTTT GGCATAAAAG ACCAAGAAAC CATTGATTCT TTTGCATTAT CTCAAGAAAC
1981 GTTTACTGGT GGTGTTACTC CTGCAGCAAC ATCTAACGAT AAACACTATA AACTGAATGT
2041 AACATATAAA GATAAAGCAG AAACGTTTAC TGGCGGATTT CCAGTTTATG AAGGCAATAA
2101 GCCTGTTTTA ACTTTAAAAG AATTAGATTT TCGTATTCGT CAAACATTAA TTAAAAGTAA
2161 AAAATTATAT AATAATTCTT ATAATAAAGG ACAAATTAAA ATAACAGGTA CAGACAATAA
2221 CTACACAATA GATTTAAGTA AAAGGTTGCC ATCAACTGAT GCAAATAGAT ATGTTAAAAA
2281 ACCTCAAAAT GCAAAAATTG AAGTTATCCT CGAAAAATCA AACTAACAAT AATAATGGAG
2341 TTAATAAAAA TAATCGCAAA TACTATATTG ACTTCGCTCA CATTTAAATT TCTTATTCCT
2401 CGTATCATGA TTCCTCTGAA AGGAGATGTT CTAATGAGTA AGAACATCAC GAAAAATATA
2461 ATTTTAACGA CAACATTATT ACTATTAGGT ACTGTATTAC CTCAAAATCA AAAACCAGTA
2521 TTTAGTTTTT ACTCTGAAGC TAAAGCTTAT AGCATTGGTC AAGATGAAAC TAACATCAAT
2581 GAATTAATTA AATATTACAC ACAGCCTCAT TTTTCATTTT CAAATAAATG GCTATATCAA
2641 TATGATAATG GAAACATTTA TGTTGAACTT AAGAGATATT CATGGTCAGC ACATATATCT
2701 TTATGGGGCG CTGAAAGTTG GGGAAATATT AATCAGTTAA AAGATCGTTA CGTAGATGTG
2761 TTTGGACTAA AAGACAAAGA TACTGATCAG TTATGGTGGT CTTATAGAGA GACATTTACA
2821 GGTGGCGTTA CACCAGCCGC AAAACCTTCT GATAAAACTT ATAATCTTTT TGTGCAATAC
2881 AAAGATAAAC TACAAACGAT TATTGGTGCG CATAAAATAT ACCAAGGCAA TAAACCAGTA
2941 TTAACATTGA AAGAAATCGA TTTCCGTGCA CGAGAAGCGT TAATAAAAAA TAAAATATTA
3001 TATACCGAAA ATCGTAATAA AGGTAAGCTT AAGATCACCG GTGGCGGTAA TAACTACACT
3061 ATTGATTTAA GCAAAAGATT ACATTCAGAT CTAGCAAATG TTTATGTTAA AAATCCTAAT
3121 AAAATAACTG TTGACGTCCT CTTTGATTAG TATATGAAGG TGACTTATAC TTCATGCACT

SEQ ID No. 104

S. aureus strain NCTC8325 (SSL1-SSL11) - coding strand

1 TTGATTAAAA TAATCTAATT GTCGAACAGG AACTTTTCCG CGCCAATCTT CTGGAACTTT
61 CTCAACATTT CTAACACCAA TGTGAAAATG ATCTATGTGA TTTGCAATGG CTTGATTTGT
121 AATATGTGTG CCTAAATGAC CTGTAGCACC TGTTAACATA ATATTCATTC ACTTCATCTC
181 CTAATCTTTA TATACATAAC ATAATACTTA TTTGATGGTT TTCAAAACAT TTGATTTTAT
241 AAAAAATTCT AATCTGTATT TATTGTCGAC GTGTATAGTA AATACGTAAA TATTATTAAT
301 GTTGAAAATG CCGTAATGAC GCGTTTTAGT TGATGTGTAT CACTAATATC ATTGAAAATT
361 TTAATCAGGT ACTACGACAA TATGATGTCT GTTTTGTGTC TGAAAGTTTT ACAGTTTTTA
421 AAATAAAAAT GGTATAAAGT GTGATTTGTA TAAAAAAGAG TCTCGACGGA TAAGAATTGA
481 TTAATAACAG TTAGCATTTT ATTAATTACC TTAACAATGA TTCAAGTTTA GTTAAATGAG
541 GTTTAATTTG AAAGGGGATA GCGCCTCAAT ATAATGTAGG TAGATTGTTC ATATTACGTA
601 ATTGAAAAAT CAAATTTAAA TAGATTGGGG CTAAAAATTA TGAAATTTAA AGCGATAGCA
661 AAAGCAAGTT TAGCATTGGG AATGTTAGCA ACAGGTGTAA TTACATCGAA TGTACAATCA
721 GTACAAGCGA AAGCAGAAGT TAAACAACAA AGTGAATCAG AGTTAAAACA CTATTATAAT
781 AAACCAATTT TAGAGCGTAA AAATGTGACT GGATTTAAAT ATACTGATGA GGGTAAACAC
841 TATTTAGAAG TCACAGTAGG GCAACAGCAT TCTCGAATCA CTTTACTTGG ATCTGATAAA
901 GATAAATTTA AAGACGGAGA AAACTCAAAT ATAGATGTGT TTATCCTTAG AGAAGGTGAC
961 AGTAGACAAG CAACAAATTA CTCAATTGGT GGCGTTACAA AATCAAATAG TGTGCAGTAT
1021 ATTGATTATA TCAATACGCC AATTTTAGAA ATCAAGAAAG ATAATGAAGA TGTACTTAAA
1081 GATTTTTACT ACATTTCAAA AGAAGACATC TCATTAAAAG AACTTGATTA TAGATTAAGA
1141 GAACGTGCGA TTAAACAACA CGGCTTGTAT TCAAATGGTC TTAAACAAGG TCAAATTACA
1201 ATTACAATGA ATGATGGCAC AACACATACA ATCGATTTAA GTCAAAAACT TGAAAAAGAA
1261 CGTATGGGTG AGTCAATCGA CGGCACTAAG ATTAATAAAA TTCTAGTAGA AATGAAATAA
1321 TACTTTCTAA CAACAAAGCG CTATGTTGAA TAGTGCTTGT TATGGAAATA TATGGAAGTT
1381 AAGCGACGTA CTGTTGCTTA GCTTCTTTTT TTGAGGGGAA AAGTTACAAA ACTCACACAA
1441 ACAGTCGCAC CACGCATTAT CTTTTGCTTA AATAGCTTAA TCATATTTTA TGAATAGTTA
1501 AAAACAGGTT AATGTGAATA TCCGAATACA GCTCCTATAA TATGGGTGTA TGATTCAAAT
1561 TACGTAATAA AACAATCTAA TTATAATAGA TTGGAGCATA CAACTATGAA AATGAAAAAT
1621 ATTGCAAAAA TAAGTTTGTT ATTAGGAATA TTAGCAACAG GTGTAAACAC TACAACGGAA
1681 AAACCAGTTC ATGCCGAAAA GAAACCTATT GTAATAAGTG AAAATAGCAA AAAATTAAAA
1741 GCTTATTATA ATCAACCTAG TATTGAATAT AAAAATGTGA CAGGTTATAT CAGTTTCATT
1801 CAACCAAGTA TTAAATTTAT GAATATCATA GATGGTAATT CTGTTAATAA TATTGCTTTA
1861 ATTGGCAAAG ATAAGCAACA TTATCATACG GGTGTACATC GTAATCTTAA TATATTTTAC
1921 GTTAATGAGG ATAAGAGATT TGAAGGTGCA AAGTACTCTA TTGGGGGTAT CACGAGTGCA
1981


AACGATAAAG 


2041 


GAATATGATT 


2101 


ATTGATTTTA 


2161 


ACAGGAAAAA 


2221 


AAGTTACAAG 


2281 


ATCAAAGTTA 


2341 


GAGAGGTTAA 


2401 


AAAGG C AC AT 


2461 


CTTAATAATA 


2521 


TATAATGATG 


2581 


AGCATACAAT 


2641 


CAACAGGCGC 


2701 


AAGTTGACAA 


2761 


GTGCAAATTC 


2821 


AAAAGGCACC 


2881 


AACCTAAACA 


2941 


AACAATCACA 


3001 


CAACAAACAC 


3061 


TAAAACAAGC 


3121 


AACCGAGTTT 


3181 


GGTTTATGAA 


3241 


AGAAAAAATA 


3301 


ATAAATATCA 


3361 


TTAATCACAA 


3421 


AT GT TT C AG A 


3481 


TGAGAAAACA 


3541 


TTATTAAAAT 


3601 


ATCGTATGGC 


3661 


AAT CATGAC A 


3721 


GTTTACATGT 


3781 


TTGTTTGTAA 


3841 


GTTGAATCAC 


3901 


ACGATGATTC 


3961 


ATAAAGCTGT 


4021 


ACAACTATGA 


4081 


GGTGTAATCA 


4141 


GAAGCACCAC 


4201 


GCGACAACAC 


4261 


CCGCCTTCAA 


4321 


ACTAAATCAG 


4381 


AAAT TTAAAG 


4441 


GGTATTATTT 


4501 


ATATATAAAA 


4561 


AATGTCGATG 


4621 


GGTGGTATCA 


4681 


AAGGAAGATA 


4741 


CAGATTTCCT 


4801 


CTGTACGGTA 


4861 


ACGTTTGAAT 


4921 


AATATTGATA 


4981 


CATCGGAAAA 


5041 


TGTTCGATGA 


5101 


GAATTTATCT 


5161 


ACGAACATTA 


5221 


TAATGTGAAA 


5281 


GAACAAATCT 


5341 


GCAAGTTTAG 


5401 


AATGCGAGTG 


5461 


GATTACTATA 


5521 


GGTGGCAAGC 


5581 


GGTAAAGATA 


5641 


AAAGAAGCGG 


5701 


CAAGACGCTT 


5761 


GACGGTATTG 


5821 


GAACTCGACT 


5881 


CCTAAAGATA 


5941 


AATAAAAAAT 


6001 


ATAGAAGCCA 


6061 


AGAGGAGTTA 


6121 


AGCGTATCGA 


6181 


TCGCGCGTTG 


6241 


AGTGTGAATA 


6301 


CCATTAATTA 



CTGTCGACCT 
ATGACTTTTT 
AATTAAGAAA 
TTACAGTCAA 
AAGACCGTAT 
TAAAAGCATA 
GTGACGATCA 
AGAAACGCTA 
CTTCAATAAT 
TTGTATGATT 
TATGAAAATG 
AATTACAGTA 
AGTACCAACG 
AGCGACAACA 
AAAT AC T AAT 
AGAAGAGCAG 
AACGACAACC 
GCCACAACCA 
ACAAACAGAT 
TGAATTTGAA 
TGTTATTCCA 
TAAAGATGGA 
ATTGAAAAAA 
AGTAGAATTA 
AT AC AT GAT T 
ACTTATTGAA 
GAAAAACGGT 
AGACGTCATA 
TTCTCTAAAT 
TGCTTAGCTT 
AAGTGGCATT 
AT C AAAAT C A 
AAAT AT AGT T 
AT G ATT C AAT 
AAATAACAAC 
CAACGACAAC 
AATCAACACC 
CGCCCTCAAC 
CTAAAGTGAC 
ACACACCACA 
ATTTAAGAGC 
TAAAAAAATG 
TTGCTTTAGT 
TATTTGTCGT 
CAAAGAGTAA 
ATAAAGGTAC 
TGAAAGAACT 
ACGTTGGTTC 
TGCACAAAAA 
ACATTGAAGT 
ACAAGAAGTT 
TTTGAGAACC 
GTAAATCCCT 
TAGATTCCTT 
GGTCAAATAC 
AAT AAT T AC G 
CATTAGGTAT 
AACATAAAGC 
GTGGCGCAAG 
ATTACCTTAT 
TTGAAAGATT 
AAAACCGTAA 
ATT AT G ATT A 
CTACGTACGG 
TTAAATTGAG 
GTAAGATAAA 
TACAAACAAA 
ACATTAGATA 
GGCAACATAA 
TGAATAATAA 
TCCTTTTCGT 
GTTAAAAAAG 
GCTTAACATT 



AATAGCAGAA 
CCCATTTAAA 
ATACCTTATT 
AAAGAAATAC 
GTCCGATGTT 
ACACATATAC 
AACGTTGCTT 
TATTAATCTC 
TGTTAAAAGG 
CAAATTACGT 
AGAACAATTG 
ACGACGCAAT 
CTTAAAGCAG 
CAAGCAGCTA 
GAGGAAAAAA 
AAAACGCT T A 
GAATCCACAA 
ATGCAATCTA 
ATGACTCCTA 
AAGCAGTTTG 
AATAGGTTCA 
CCTTACGATA 
TATTCTGTCG 
AGCATTACTA 
ACTAAGGAAG 
AAACATAATC 
GGGAAATATA 
GATGGCACTA 
AGAAGCTGTC 
CTTTTATTAT 
TCTATGTCTT 
TTTTTATTTA 
AAACAAGGTT 
GAATGTAATC 
GATTGCTAAA 
GCAAGCAGCA 
GCCCTCAACT 
TAAAGTAGAA 
AACACCTCCA 
ATCGCCAACC 
GTATTATACG 
GACGACAATA 
TGGTAAAGAT 
TT T AGAAGAA 
TAGTAAAAAA 
AATCTCTCAT 
TGATTTTAAA 
AGGTAAAATT 
ATTACAAGAA 
GAATATAAAA 
AAGTGACAAC 
CGAATTTTCG 
ATCTATCGGG 
AATTTACTTA 
GCTAACTATA 
AATGGAGCAT 
TTTAGCAACA 
AAAATATGAA 
TAAGGAACTT 
CTTTGATAAA 
TAAAGCACGT 
TGGCACAGTG 
TATAAACGCA 
TAGAGTACAC 
ACAGTATTTA 
AGTGATAATG 
TCGCATGAGT 
ATTCAATGAA 
GTTGCTTAGC 
AAACACCAAT 
GACATGAAAC 
CTGCGTTAAG 
GGTTCAAAAA 



GCAAGAGTTA 

ATAGATAAAG 

GATAATTATG 

TATGGAAAGT 

ATCAATGTCA 

TTGATGACGA 

AACTTCTTTT 

ATACTCACTC 

GGTTTAATGT 

AAAAAGACAA 

CTAAAACCAG 

CGGTCAAAGC 

AGCGATTAGC 

ACACAAGACA 

CCTCAGCTTC 

ATATATCAGC 

CGCCGAAAAC 

C TAAAT C AG A 

AATATGAAGA 

GATTTATGCT 

TCTATAAAAT 

ATATCGATGT 

GTGGCATCAC 

AAAAAGATAA 

AGATTTCCTT 

TTTACGGTAA 

CGTTTGAATT 

ATATTGATAA 

ATCGGAAAAA 

GCGTAATGAT 

AAAAGTGACG 

ACGAACATTA 

TAATGTGAAT 

GAACAAATCT 

ACAAGTTTAG 

AACGCGACAA 

AAAATAGAAG 

GCACCGCAAC 

TCAACAAACA 

ACAAAACAAG 

AAACCAAGTT 

AGATTTATGA 

GATAAAAAAT 

AATAATTACA 

GTTGATCACA 

GATGTTTCAG 

TTGAGAAAAC 

GTTATTAAAA 

AATCGCATGG 

T AAT CATGAC 

GGCCTACATG 

ATGGGTCCAA 

TGTGAAGCAC 

ATAATGATTC 

ATAAAGCTGT 

ACAACTATGA 

GGAACAATAA 

AATGTGACAA 

AAAAATGTTA 

AATAGAAAAT 

AAAAATCCGG 

TTTTCATATG 

CCAAGATTTC 

TACATTTATA 

ATTCAAAATT 

AAAGATGGCG 

GACGTCATTG 

ATATGGATAA 

TTCTTTTTTG 

AAAACTTGTG 

AATGTGGAAA 

TTTAAAAAAT 

TAGTTAAAAA 



TTAAAGAAGA 

AAGCGATGTC 

GTCTTTACGG 

ATACATTTGA 

C AG AT ATT GA 

AATAAGTTGA 

TAATGCTTAA 

ATTATTTTTT 

GATTATCTTA 

TCGAATATAA 

TTTAGCACTA 

AGAAAAAATA 

AATGATAAAC 

AGAACGCACG 

CAAAATAGAA 

AACGCCAGCG 

TAAAGTGACA 

CACACCACAA 

TTTAAGAGCG 

CAAACCATGG 

AGCTTTAGTT 

ATTTATCGTT 

GAAGACTAAT 

TCAAGGTATG 

GAAAGAGCTT 

CATGGGTTCA 

ACACAAAAAA 

CATTGAAGTG 

CAAGAAGTTA 

GTAAAAAGAC 

AAACTTCAAA 

TGGATTTCTT 

GGAGCAATAC 

AATAATTACG 

CACTAGGCCT 

CACTATCTTC 

CACCGCAATC 

AAACAGCAAA 

CGCCACAACC 

TACCAACAGA 

TAG AAT T T AA 

ATGTTGTCCC 

ATGGTGAAGG 

ATCTGGAAAA 

AAGCAGGAGT 

AATTCAAGAT 

AACTTATTGA 

TGAAAAACGG 

CAGATGTCAT 

ATTCTCTAAA 

TTGCTTAGCT 

ATATGACGTG 

AACGGGATCA 

AAT G ATT ATT 

ATGATTCAAT 

AAATGACAGC 

CGTCATTGCA 

AAGATATCTT 

CTGGTTATCG 

TCACAAGAGT 

GATTAGACAT 

GTGGTGTCAC 

AAATCAAGAG 

AAGAAGAGAT 

TTGATCTGTA 

GC T AT TAT AC 

ACGGTAGAAA 

TAGTAAAATA 

TGTTGGAGAG 

GAAATAGTTG 

ACATAGTTAA 

AGATTAACGC 

GAGGTTAATT 



TCATACTGGT 

ATTGAAAGAG 

T GAAA.T GAG T 

ATTGGATAAA 

TAGAATTGAA 

AATTGAAATA 

AAATTATTTC 

GCTTAAATTA 

GAACGCCATC 

TATAGATTGG 

GGGCTTTTAA 

CAATCAACTA 

ATAACAGCAG 

CCTAAACTCG 

AAAATATCAC 

CCTAAACAAG 

ACACCTCCAT 

TCTCCAACCA 

TAT TATA CAA 

ACGACGGTTA 

GGAAAAGATG 

TTAGAAGACA 

AGTAAAAAAG 

ATTTCACGCG 

GAT TTT AAAT 

GGAACAATCG 

CTGCAAGAGC 

AATATAAAAT 

AGTGACAACG 

GAATATTCAT 

TGTGCCAAGT 

AATTTACTTA 

GCCATCTATA 

AATGGAGCAT 

TTTAACAACA 

CACTAAAGTG 

AAAACCAAAC 

TGCGACAACA 

AATGCAATCT 

AATAAATCCT 

AAATGAGATT 

AGATTATTTC 

AGTACATAGG 

ATATTCTGTC 

AAG AAT TACT 

TACTAAAGAA 

AAAAAATAAT 

TGGAAAGTAC 

AGATGGCACT 

TAGAAGCTGT 

TCTTTTGTTA 

GAAGAGACCT 

GTTTTATTTA 

AAACATGGTT 

AGACGTAAGC 

AATTGCGAAA 

TCAAACTGTA 

TGACTTAAGA 

TTATAGCAAA 

ACAGATATTT 

ATTTGTTGTT 

TAAGAAAAAT 

AGATGAAGGT 

TTCACTTAAA 

TAAAAAGTTT 

GTTTGAACTT 

TATTGAAAAA 

TGGATAGTAT 

ATGAAAATGA 

ATACTTATAG 

ATTGAGGGAA 

TGTTAGGATT 

CATAGCTTAG 
TATTACGCTT
AAATATATTA 
GTTAGCTAAA 
TCAAACAGTA 
CTACTATAGC 
GACAGATAAA 
TAAAGATATT 
CGAGGAGGAA 
TAAAGAATTT 
AAAGTCTAAA 
CTTTAAATTA 
AAAAGGTAAA 
AAAACTACAA 
TGTTGAGCTA 
GCGATAACAT 
ATGATCAAGT 
ATGAAACAAT 
TATTGTGTTA 
TCAAAAATAG 
AGATTGTTCG 
AATAGATTGG 
AGGCTTATTA 
G C AAGAGAG A 
AAGTTTTGAA 
ACGCTTTAAC 
ATATAAAGAA 
CGGTAGATTA 
TACACATTTA 
TTCAATTAAT 
AGTTAAAAAT 
GAAAGATGGA 
TGATGTGTTG 
TAAGTAATCA 
AGATACGTAC 
TTACGTTTTA 
TGATGCCCAT 
GTGTTAATGT 
AATAAAAGCA 
CTAAAGCGAT 
CGGTTAATGC 
ATCAATACTA 
ATTATGACTC 
TGGGAAAAGA 
CAGAATTAGT 
T AAAAT C AAT 
AAGACGGTTT 
AACTTGATTT 
CTGATAAAGG 
GTGATAAATT 
TCGAAGTGAA 
ATAATCCCAT 
AC T ACGTT AT 
AATCGGAAAG 
TACGAATGTT 
GTTAATGTCA 
TAATAACAAT 
AAAGCAACAT 
GGTCACGCGA 
CAATACTATT 
TATGGCTCTA 
GGTGACGATA 
GAATTAATAG 
AGATCAGTGT 
AATGGCTTTT 
CTGGACTTTA 
GATAAAGGTA 
GAAAAATTAA 
GAAGTGAATT 
AAAATGTGAA 
TTAAGTTGTT 
ATAATCCGGT 
TATTTGTTTT 
AAC GAT G C AC 



ATATAATGAT 
AGACAAAATT 
GC AAC AT TAG 
AAAGCGGCAG 
AAGCCAAGTA 
GGAGTATATG 
GAAAAATACC 
ACTGTTAATG 
AGTAAAGAAG 
GATAGTAAAT 
AGAAAAAAAT 
ATTGTAGTTA 
CCGCATCGCA 
GAATATAAAT 
ATTGCTTAGC 
TTTTGGAAAA 
GTGGAAAACA 
TAAAAAATAA 
TTAAAAAGAG 
TATTACGTAA 
GAGAATAGTA 
ACTACTGGTG 
GTACAACATT 
TTCAGTAATA 
CAAG AAAAT C 
GGCATTGAAG 
TCTACTGTTG 
TTTGTTAATA 
AAAGAAGAAG 
TATGGTTTAT 
GAAAAGCAAG 
AATAGTAAGG 
ATGACTCTAA 
ACCTTACATT 
TAACGCAGGT 
TGAATATTAA 
CAGTCTGTCT 
AT C C AATAT A 
ATTTATATTA 
AAAAGGAAAG 
TTCAGGACCT 
TAACGTTTTG 
TG AAAAT AAA 
AGATTTAGAT 
ATTTGAGTCT 
TTCTATTGAT 
TAAAATAAGA 
TAGAATTGTT 
AGATTTCGAG 
TTTGAAATAA 
GTTTAATGAT 
TAAGCTGCTT 
AACAATGATT 
AAAATACGTT 
GTCTGTTTTG 
CCAATACATT 
TAGCATTAGG 
AAGTAGAACT 
CTGAAGAAAG 
ACGTTTTAAA 
AAAATAAATA 
ATATAAAAGG 
TTGGATTTGT 
CGATAAACGA 
AAATAAGAAA 
GAATTGTTAT 
GTTTTGAACG 
TGAATTAGTT 
TTGGCATTTG 
TTTGTAATTC 
AAAAC AAC T A 
TCATTAAGAA 
GTTATAATAA 



AGTAGATTGT 

TATAAATAGA 

TATTGGGATT 

AATCAACTCA 

TAGAGTTAAA 

TTTGGAAGGA 

CTCAAGGTGA 

GAAGACAATA 

TCGATGTTAA 

TTAAAATTAC 

T GAT GGAAG A 

AAATGGAAGA 

TGGGTGACAC 

AATCTTTGGA 

TTCTTTTTTA 

ACGGTTGATA 

TAATTAAATT 

TTAATACTGT 

GTTAATTCAT 

TTGAATTAAT 

CTGTGAAATT 

TGATTACATC 

TAT AT GAT AT 

TTAGTGGTAA 

AAAATCACCA 

GCAAAGATGT 

GTGGTGTGAC 

AAGTGTATGG 

TTTCACTGAA 

ATAAAGGTAC 

AAATTGATTT 

ATATTAATAA 

AGTAATAAAT 

AAGGAGCGTA 

TAT GAGT GAG 

TTAGCTCTTC 

CAATGCCCTT 

TTAAGATTGG 

GGAATATTAA 

TATGAAAAAA 

AGTTATGAGT 

CTTTTTAACC 

TACAAAGAAA 

GGAAGAATAT 

CTAAGAACGC 

GAATTTTTCT 

AAACTGTTGA 

ATTAATATGA 

CGTATGGCAG 

T C AATG AT AT 

TTTGATACGT 

TTTTGTACAC 

CACCAAAAAA 

TGATTTTCAT 

ATGCACCTTA 

AAGATTGGAG 

AATATTAACT 

TGATGAGACA 

TTTTGAACCA 

CTTTAAACAA 

TAAAGAAAAA 

TGGCATATAT 

AAGTAATCCA 

GTTGTTTTTT 

ACTCTTAATC 

CAATATGAAA 

TATGTTTGAT 

CGAGTTAATA 

TGTCCTTATA 

AAAG AG C AG A 

AATGAAATAA 

TAATTCAAGT 

AAATGTATGA 



TCGTATTACG 

TTGGGAGAAT 

GTTAGCTACT 

AGGTCAACAC 

AAATCTTGAT 

TCGAAAAGAT 

GCATGATAAG 

TTCAATTGGT 

AGTAACAAGA 

TAAAGAAGAA 

AGAAAAATTA 

TGATAAGTTT 

GATAGATGGT 

CAAGCAGACT 

TTTTGTTATG 

CTTATAGTCG 

GAGG GAAAGT 

TAGGATTTCA 

AGCGCAGTAT 

CATATAAAAA 

AAAAACGTTA 

AGAAGGCCAA 

T AAAG AC T T A 

GGTTGAAAAT 

ATTATTCTTA 

CTTTGTGGTA 

TAAGAAAAAT 

CGGAAATTTA 

AGAACTTGAT 

GAC TAAATAC 

AGGTGATAAA 

GATTGAAGTG 

TTGAAGCAGC 

TTTAAAACAA 

CATACTAAAA 

ATTAACAATG 

TAT AAT AAAG 

AGCACATGAA 

CAACAAGTGT 

TGAACCGTTT 

TAACAAATGT 

AACAAAATCA 

AAAC AC AT GG 

TTAGTGTTAG 

C GAACTTACT 

TTATTCAAAA 

TTAAAAAATA 

AAG AT GAAAA 

ATGTC AT TAA 

AT AT AG AAT G 

GTTTTTATAA 

T TT AT AAC C A 

ATATTTATGT 

T AAT AAT GAT 

TAATAAAGAC 

CAAAAAAATA 

ACAGGTGTGT 

CAACGCAAAT 

ACAAACATTA 

CGAAATAAAG 

ACACATGGCC 

AGCGTTGGCG 

AGTCTACAAG 

ATTCAAAAGG 

GAAAAATATA 

GACGAAAAGA 

GTAATGGATA 

GCATAATAGC 

TAAGGAACTG 

ACAGAGTAAC 

TGAAAGTCAT 

ATATTTAAAT 

TTCAAATTAC 



TAATTGAAAT 

AGTACTATGA 

GGTGTAATAA 

AATTATAAAT 

GGTTTGTATA 

TATTTTGTTG 

CAAGATGCAT 

GGTTTAAGTA 

AAAATTGATG 

ATCTCGTTAA 

TATGGTGCTG 

TATACTTTCG 

ACC AAAAT CA 

AGTAATTGTA 

ATGAAAAAAG 

CGCGTTGTCC 

GTGAATAGTT 

TTAACTAACT 

CTCACTTATA 

TATATTAAGA 

GCTAAAGCAA 

GCAGTTCAAG 

CATCGATACT 

TATAACGGTT 

TTAGGTAAAG 

AAAGAATTAA 

AACAAATCTT 

GATGCATCAA 

TTCAAAATTA 

GGTAAGATCA 

TTGCAATTCG 

ACTTTGAAAC 

TTAACGATGA 

CCTTGTCGTT 

ATTTACATTG 

ATTTAAGTTT 

GTGTATTATT 

TATGAAATTT 

AATGATAACA 

AT AT GAT AC A 

TAGTGGCCAA 

AAAGTTCCAA 

TTT AG ATGTC 

TGGTGTAACA 

AGTTAAAAAA 

GGAAGAAGTG 

CAAACTGTAT 

TAAGTATGAA 

TAGTGAACAA 

AAAGCTTAAG 

TAAAAACATA 

ATAGCTTAAG 

TGCTATTAAA 

TCAAGTTTAT 

AGATAGTTCA 

TGAAATTAAC 

TTACAGCAGA 

ATTATATCAA 

GTGTTAAAAG 

CTTTTAAAGT 

TTGATGTCTT 

GTATAACAAA 

TTAAAAAAGT 

AAGAAGTATC 

GATTGTATAA 

AGCATGAAAT 

GTAAGCAAAT 

TTAAGAAGCG 

TGTTAAATAC 

ATCATCAGTT 

TTAACCTGAA 

CGAGGTTAAT 

GTAATGAAAA 



AAT C AT AT AA 

AATTAAAAAC 

CAACAGAAAG 

CGTTAAAATA 

GACAGAAAGT 

GCTTGCTTGG 

TTTTAGTCAT 

AGACAAATAG 

AAT CAT C GG A 

AAGAGTTGGA 

TTAATAATAG 

AACTTACAAA 

AAGAAATTAA 

GGGAAGTTAA 

GAGCGGGTTT 

TTTTCGTGAC 

AAAAAATTAG 

TAACGTTGGT 

TAATGATAGT 

CAAATTTATA 

CATTGGCATT 

CAAAAGAAAA 

ACTCATCAGA 

CTAACGTTGT 

ATAAAGAGAA 

TTGATCCAAA 

CTGAAACTAA 

TTGACTCATT 

GAC AAC AT TT 

CTATCAATTT 

AGCGCATGGG 

AAATTTAAAG 

AAT GT TG AAT 

AGGCTTTTTT 

CTCTTGAAAG 

AAT T AAAC GA 

CAAATTACGT 

ACAGTGATAG 

GAAAATCAAT 

AACAAGTTAC 

AGTCAAGGTT 

GTGTTTTTAT 

TTTGCGGTAC 

AAGAAAAACG 

ATAGACGATA 

TCATTGAAGG 

GAAGGGTCAG 

ATTGATTTAA 

AT T AAAAACA 

AAGCGGTTTA 

TCGAACATTG 

AT TT AAAACT 

AATCAGTTAA 

TTAAATGAGC 

AAT T AC GT AA 

AACGATAGCT 

AAGTCAAACT 

TATGCTACAT 

TGAAGATTAC 

ATTTTTACTT 

TGCAGTACCT 

GAAAAATGTG 

TGATGCTAAA 

ATTGAAGGAA 

AGGAACGTCT 

TGATTTAAGT 

TAAAAATATT 

GCTTAACGAC 

ATTACTGTTG 

GTAGTAAACG 

CATTAAAATA 

TATCGTATGA 

CAATCCAATA 
10741 TATTAAGATT GGAGCAAATA AATATGAAAT TTACAGCATT AGCAAAAGCG ACATTAGCTT

10801 TAGGAATTTT AACAACAGGA ACTTTAACAA CAGAAGTTCA TTCAGGTCAT GCAAAACAAA 

10861 ATCAAAAGTC AGTAAATAAA CATGACAAGG AAGCATTATA CCGATACTAC ACTGGAAAGA

10921 CTATGGAAAT GAAAAATATT AGTGCTTTGA AACATGGTAA AAACAACTTA CGTTTTAAGT 

10981 TTAGAGGTAT TAAGATTCAA GTTTTACTGC CTGGAAATGA TAAAAGTAAA TTTCAACAGC

11041 GTAGTTATGA GGGGTTAGAT GTTTTCTTTG TTCAAGAAAA AAGAGATAAG CACGATATAT 

11101 TTTATACTGT TGGTGGTGTA ATACAGAATA ATAAAACATC TGGAGTTGTC AGTGCACCAA 

11161 TATTAAATAT TTCAAAAGAA AAGGGTGAAG ATGCTTTTGT GAAAGGTTAC CCTTATTACA 

11221 TTAAAAAAGA AAAAATAACA CTAAAAGAAC TGGATTATAA GTTGAGAAAG CATCTAATTG 

11281 AAAAATACGG ACTTTATAAA ACAATCTCAA AAGATGGTAG GGTCAAAATT AGCTTGAAAG 

11341 ATGGCAGTTT TTATAACCTT GATTTAAGAT CTAAATTAAA ATTTAAATAT ATGGGGGAAG 

11401 TCATAGAAAG CAAACAAATT AAAGATATTG AAGTTAACTT AAAGTAAATC ATTACGAATA

11461 ATTAAAAGTA ATTGAAGCGG CTTAACGGTG AAATGTAAAT TGGTGCGCAT AGCTTATACA

11521 AAAAGGATGC ATCAATCGAT ATCGTCGTTA AGCCGTTTTG GTTTGTGTGT CATGAATCCT 

11581 ATCCCAATCT CCATAAAGGT AAAATTTCCA CCACCAACAT CAAAATTCTC CACATCGCAA 

11641 CATAACCAAA TGTTATAATA AATCTATTAC ACAAAGAGAT AAATTACTTA TTCAAAGGCG 

11701 GAGGAATCAC ATGTCTATTA CTGAAAAACA ACGTCAGCAA CAAGCTGAAT TACATAAAAA

11761 ATTATGGTCG ATTGCGAATG ATTTAAGAGG GAATATGGAT GCGAGTGAAT TCCGTAATTA

11821 CATTTTAGGC TTGATTTTCT ATCGCTTCTT ATCTGAAAAA GCGGAACAAG AATATGCAGA 

11881 TGCCTTGTCA GGTGAAGACA TCACGTATCA AGAAGCATGG GCAGACGAAG AATACCGTGA 

11941 AGACTTAAAA GCAGAATTAA TTGACCAAGT CGGTTACTTC ATTGAGCCAG AAGATTTATT 

12001 CAGTGCGATG ATTCGTGAAA TTGAAACGCA AGATTTCGAT ATCGAACACC TGGCGACGGC

12061 AATTCGTAAA GTTGAAACAT CAACATTAGG TGAAGAAAGT GAAAATGACT TTATCGGTCT

12121 GTTCAGCGAT ATGGATTTGA GTTCAACGCG ACTAGGTAAC AATGTCAAAG AACGTACTGC 

12181 TTTAATCTCT AAAGTCATGG TTAATCTTGA CGACTTACCA TTCGTTCACA GTGACATGGA 

12241 AATTGATATG TTAGGTGATG CATATGAATT CCTAATTGGG CGCTTTGCGG CGACAGCGGG 

12301 TAAAAAAGCA GGCGAGTTCT ATACACCACA ACAAGTATCT AAGATACTGG CGAAGATTGT 

12361 CACAGACGGT AAAGATAAAT TACGTCACGT GTATGACCCA ACATGTGGTT CAGGTTCACT 

12421 GTTGTTACGT GTTGGTAAAG AAACACAAGT GTATCGTTAT TTCGGTCAAG AACGTAACAA 

12481 TACTACATAC AACTTAGCAC GCATGAATAT GTTATTACAT GATGTGCGTT ATGAGAACTT 

12541 CGATATCCGT AATGATGACA CATTGGAAAA CCCAGCCTTT TTAGGCAATA CATTTGATGC 

12601 GGTTATTGCG AACCCACCGT ATAGTGCGAA ATGGACTGCA GATTCAAAGT TTGAAAATGA 

12661 CGAACGATTC AGTGGTTACG GCAAACTTGC GCCTAAGTCT AAAGCAGACT TTGCCTTTAT 

12721 TCAACACATG GTACATTACC TAGACGATGA AGGTACCATG GCCGTTGTAC TCCCACATGG 

12781 TGTATTATTC CGAGGTGCTG CAGAAGGTGT CATTCGTCGT TATTTAATTG AAGAAAAGAA 

12841 CTACTTAGAA GCTGTGATTG GTTTGCCAGC GAATATTTTC TATGGGACAA GTATTCCAAC

12901 ATGTATTTTA GTATTTAAAA AATGTCGCCA ACAAGACGAC AACGTACTAT TTATCGATGC

12961 ATCCAATGAT TTTGAAAAAG GAAAAAATCA AAATCATTTA AGCGATGCCC AAGTCGAACG 

13021 TATTATAGAC ACATATAAGC GTAAGGAAAC AATTGATAAA TATAGCTACA GCGCGACACT

13081 ACAAGAGATT GCCGATAACG ATTACAACCT AAATATACCG AGATATGTCG ATACATTCGA

13141 AGAAGAAGCA CCGATTGATT TAGATCAAGT CCAACAAGAT TTGAAAAATA TCGATAAAGA 

13201 AATCGCAGAA ATTGAGCAAG AAATCAATGC ATACCTGAAA GAACTTGGGG TGTTGAAAGA 

13261 TGAGTAATAC ACAAAAGAAA AATGTGCCAG AATTGAGGTT CCCAGGGTTT GAAGGCGAAT 

13321 GGGAAGAGAA GCAGTTAGGG GATCTTACAG ATAGAGTAAT TAGGAAAAAT AAAAACTTAG 

13381 AATCGAAAAA GCCTTTAACA ATATCCGGAC AGTTAGGTTT AATTGATCAA ACAGAATATT 

13441 TTAGTAAATC AGTTTCGTCG AAAAATCTAG AAAATTATAC ACTAATAAAG AATGGAGAAT

13501 TCGCGTATAA CAAAAGTTAT TCTAATGGAT ACCCATTAGG GGCTATTAAA AGATTAACTA 

13561 GATATGATAG TGGTGTATTG TCCTCTTTGT ATATTTGTTT TTCTATTAAA AGTGAAATGT

13621 CTAAAGACTT CATGGAAGCA TATTTTGATT CGACACACTG GTATAGAGAA GTTTCTGGAA

13681 TTGCAGTTGA GGGTGCAAGA AATCACGGAT TATTAAATGT TTCTGTGAAT GATTTTTTTA 

13741 CTATTCTAAT TAAATATCCA AGTTTAGAAG AACAGCAAAA AATAGGCAAG TTCTTCAGCA 

13801 AACTCGACCG ACAAATTGAA TTAGAAGAAC AAAAGCTTGA ATTACTTCAA CAACAGAAAA 

13861 AAGGCTATAT GCAGAAAATT TTCTCACAGG AACTGCGATT CAAAGATGAG AATGGTGAAG 

13921 ATTATCCAGA TTGGGAAAAT AGCAAAATAG AAAAATATTT AAAAGAGAGA AACGAACGTT

13981 CTGACAAAGG GCAAATGCTT TCAGTAACTA TAAATAGTGG CATTATAAAA TTTAGTGAAT 

14041 TGGATAGAAA AGATAATTCA AGTAAAGATA AAAGTAATTA TAAAGTAGTT AGGAAAAATG

14101 ATATTGCATA TAATTCTATG AGAATGTGGC AAGGGGCTAG TGGTAAATCA AATTATAATG

14161 GGATTGTTAG CCCTGCATAT ACTGTGCTTT ATCCAACACA AAATACTAGC TCATTATTTA 

14221 TTGGATATAA GTTTAAAACA CATAGAATGA TTCATAAATT TAAAATTAAT TCACAAGGAT

14281 TAACATCAGA TACATGGAAC TTAAAATATA AACAATTAAA AAATATAAAT ATAGATATAC

14341 CTGTATTGGA GGAACAAGAA AAGATAGGTG ATTTCTTTAA AAAAATGGAT ATATTGATAA

14401 GTAAACAGAA AATGAAAATT GAAATATTAG AAAAAGAGAA ACAATCCTTT TTACAAAAAA 

14461 TGTTCTTATA ACTTTGATAA ATACATAGAT TGCATAAGAA TAAAATTTGT ATAATTTAAC

14521 ATAAAAGTTG TAAAAGTAAA GTGAATTAAA AACGAACATT AAATTTAGGC ACTGTGAAAG 

14581 CGCAGTGTCT TTTTTGTGTC GAAATTGTGT ACAGAATAAG TAGTTAAATA AAGATTAAGT 

14641 TGAGATAAAG TGTTATTCGT AAATAAAAGA GAGTAGATCG ATAGGAATTG AATGATATTA

14701 GTTAACTATT TATTAAATTA CTTAATAATG ATTAATTTTT AGTTAAAGTA AGTTTAATGT 

14761 GAAGCACGAC CATTGCTCAT TATAATGAAT GAGGATTGTT CGTATTGCGT AATAGAATAA

14821 ATCAAATAGA CTAAAAATTG GGAGCATAGA ATTATGAAAT TAAAAAATAT TGCTAAAGCA 

14881 AGTTTAGCAC TAGGGATTTT AACAACAGGG ATGATTACAA CTACTGCTCA GCCAGTAAAA 

14941 GCAAGTACAT TAGAGGTTAG ATCACAAGCT ACTCAAGACT TGAGTGAATA TTATAATAGA 

15001 CCGTTCTTTG AGTATACAAA TCAGTCAGGA TATAAAGAGG AAGGAAAAGT GACGTTTACT 

15061 CCTAATTATC AACTTATAGA TGTAACTTTA ACTGGGAATG AAAAGCAAAA TTTTGGTGAA 
15121 GATATTTCTA ATGTAGATAT ATTTGTTGTA AGAGAAAATT CTGATAGATC TGGTAATACA
15181 GCTTCAATTG GTGGTATTAC TAAAACAAAC GGTTCAAATT ATATTGATAA AGTAAAAGAT
15241 GTAAATTTAA TAATTACTAA AAACATCGAT AGTGTTACAT CAACGTCAAC ATCATCTAGA
15301 TATACAATTA ATAAAGAAGA AATTTCATTA AAAGAACTTG ATTTTAAATT AAGAAAGCAT
15361 TTAATTGATA AACATAACCT TTATAAGACA GAACCTAAAG ACAGTAAAAT TCGAATTACT
15421 ATGAAAGATG GTGGGTTCTA CACATTTGAA TTGAATAAAA AGTTACAAAC ACACCGTATG
15481 GGTGATGTTA TTGATGGCAG AAATATAGAA AAAATTGAAG TGAATTTATA AAATTATTCG
15541 AGGGAGCATA TCATGAGGGA AAATTTTAAG TTACGTAAAA TGAAAGTCGG TTTAGTATCT
15601 GTTGCAATTA CAATGTTATA TATTATGACA AACGGACAAG CAGAAGCATC TGAAAATCAA
15661 AACGCTTTAA TCTCTAATAT AAATGTAGAC AATCAG


SEQ ID NO. 105 

S. aureus strain EMRSA-16 (252) (SSL1-SSL11) - coding strand



1 TGCAATGGCT TGATTTGTAA TATGTGTGCC TAAATGACCT GTAGCACCTG TTAACATAAT 

61 ATTCATTCAC TTCATCTCCT AATCTTTATA TACATAACAT AATACTTATT CGATGATTTT

121 CAAAACATTT GATTTTATAA AAAATTGCAA TCTGTATTTA TTGTCGCCGT GTATAGTAAA 

181 TACGCATATG TTATTAATGT TAAAAATACC GTAATGACGC GTTTTAGTTG ATGCGTGTCA 

241 CCATGATCTT TGAAAATTTT GACATGGTAC TGCGACGATA TGATGTCATT TTTGTGTCTG 

301 AAAGTTTTAC AGTTTTTAAA ATTAAAATGG TTTAAAGTGT GAATTGTATA AAAAAGAGTC 

361 TTGACGGATA AGAATTGATT ATTAACAGTT AGCATTTTAT TAATTACCTT AACAATGATT 

421 CAAGTTTAGT TAAATGAGGT TTAATTTGAA AGGGAATAGC GCCTCAATAT AATGTAGGTA 

481 GATTGTTCAT ATTACGTAAT TGAAAAATCA AATTTAAATA GATTGGGGCT AAAAATTATG

541 AAATTTAAAG CGATAGCAAA AGCAAGTTTA GCATTGGGAA TGTTAGCAAC AGGTGTAATA 

601 ACGTCGAATA TACAATCAGT ACAAGCGAAA ACAGAAGTTA AACAACAAAG TGAGGCTGAT 

661 TTAAAACTTT ATTATAATGG ACCAAGTTTT GAATATAAAA AAGTAACTGG ATATGGATTT

721 ATTGAAGGTA AAGATAGATT CATTGATTTT ATATACAATG GACAATATAA TAAAATATCT

781 TTAGTTGGTT CTGATAAAGA TAAATATAAT GAAGAAGTTA ACCCAGATAT AGATGTGTTT 

841 GTCGTTAGAG AAGGAAACGG TAGACAAGCT GATAATCATT CGATTGGTGG CGTAACAAAA 

901 ACTAATAGAG GAGTGTATTA CGACTATATA CACTCTCCAA TCCTTGAAAT TAAGAAAGGT 

961 AATGAAGAAC CACAAAATAG TCTGTATCAA ATTTATAAAG AAAATATCTC ATTAAAAGAA

1021 CTTGATTATA GATTACGAGA ACGTGCAATC AAACAACACG GATTGTATTC AAATGGTCTT 

1081 AAACAAGGTC AAATTACAAT TACTATGAAA GATGGCAAAT CACATACTAT CGATTTAAGT 

1141 CAAAAACTTG AAAAAGAACG TATGGGTGAT TCTATCGACG GCAGACAAAT ACAAAAAATT 

1201 CTAGTAGAAA TGAAATAATA CTTTCTAACG ACAAAGCGCT ATGTTGAATC GTGCTTGTTA 

1261 TGGAAATATA TGGAAGTTAA GCGACGTACT GTTGCTTAGC TTCTTTTTTG AGGGCAAAGT 

1321 TACAAAACTC ACACAAACAG TCGCAGCACG CATTATCTTT TACTTAACTA GCTTAATCAT 

1381 ATTTTATGAA TAGTTAAAAA AGGGTTAATG TGAATATCAG CATACAGCTC CTATAATATG

1441 GTTGTATGAT TCAATTTACG TAATAAAACA ATCTAATCTA ATATATTGGA GCATACAATT 

1501 ATGAAAATGA AATCAATTGC AAAAGTAAGT TTAGTGTTGG GTATTTTAGC TACAGGTGTA 

1561 AATACTGTAA CGGAACAACC GGTGCATGCC GAAAATAAAC ATGTTCAAGT AAGCCAAAAT

1621 AGTAAAAATT TAAAAGCGTA CTATACTCAA CCTAGTGTTG AATATAAAAA TGTGACAGGT 

1681 TATATCAGTA GTATTCAACC TAAACCAGGC ACTAAATTTT TGAATATGAT AGAAGGTAAT

1741 ACAGTTAATA ATCTAGCTTT AGTTGGCAAA GATAAGGAAC ACTATCATAC GGGTGTACAT 

1801 CGCAACCTTG ATATATTTTA TGTTAATGAG GATAAGAGAT TTGAAGGTGC AAAGTACTCG 

1861 ATTGGTGGTA TCACAAAAGC AAGCGATAAA GTTGTCGACC AAGTAGCAGA AGCAAGAGTC 

1921 ATTAAAGAAG ATCACACTGG TGAATATGAT TATGACTTTT TCCCATTTAA AATAGATAAA

1981 GAAGCAATGA CATTAAAAGA GGTTGATTTT AAAATAAGAA AACATCTTAT TGATCATTAT

2041 GGTCTTTATG GTGAAATGAG TTCAGGAACA GTTACTGTCC AAATGAAATA CTATGGCAAG

2101 TATACAATTG AATTGGATAA AAAGTTACAA GAAGACCGAA TGGCCGATAT CGTCAAAGTC 

2161 ATAGATATTG ATAGGATTGA AGTCAAAGTT AAAAAAGCAT AATGCTTAGA CTGGTCGTCG

2221 TAATGAATTG AAATTGCAAT AGTGAGGTTA AGTGATGATA AAACGTTACT TAACTTCTTT

2281 TTTATGTCTT AAAATCATTT CAAAGATACA TAGTAACACC ACATAAACCT CGAAAGTACT 

2341 CATTATTTTT TGCTTAAATG ACTTAATAAT ATTTCAACAA TTGTTAAACG AGGGTTAATG

2401 TGAATAGACC AATATGTCCT CTATAATGAA GTTGTATGAT TCTAATTACG TAAAGAACAA

2461 TCGAATAATT ACGATTGGAG CATACAACTA TGAAAATAAG AACAATTGCG AAAGCGAGTA 

2521 TAGCGTTAGG GCTTGTAACA ACAGGTGCAA GTATAGTAAC AACGCAATCG GCCAATGCAG 

2581 AAGTAGCATC AGCACTTAAA GCAGGACAGT TAGCAAAGAT AAACGTATCA ACAAGCGCAA 

2641 TCACAACGAC AGCGCAAGCA GTGAACGCAG AACAAAATCA TACGGCTAAT CCTGAACAGG

2701 CGGCAAAGTC TAACACAGAA AATGTATCAA CATCGCTTTC AACTCAAACA GAACAGACGC

2761 CAAAATCTAA CATACAAAAA GCAACACAAC CAGCACCAAG CGGAACAACT CAGTCTTTAC 

2821 CAAATGCACA ACCGCAATCA ACACAACCAA CACCAAGTGT AACAACACCG CCTTCATCTA 

2881 ATGTCGAAAC ACCACAACCA ACATCGCCAA CCACAAAACA AGCACAAAAG GAAATAAACC 

2941 CTAAATTTAA AGGTTTAAGA TCATATTATA CGAAATCAAG TTTAGAATTT AAAAATGAGT

3001 TGGGTATAAT TATCAAAAAA TGGACAACAA TAAGATTTAT GAATATTGTT CCAGATTATT

3061 TCATATATAA AATAGCTTTA GTTGGAAAAG ATGATAAGAA ATATGGTGAA GGCGTGCATA 

3121 GGCATGTCGA TGTATTTATC GTTTTAGAAC AAAATAAATA TGGCGTAGAC AAATACTCGG 

3181 TCGGTGGTAT CACAAAAGCA AATAGAAAGA AAGTTGATTA CAAAACTGGA ATAAGTATTA

3241 CTAAAGAAGA TAAAAAAGGT ACAATCTCAC ATGATGTTTC AGAATACAAG ATTACTAAAG 

3301 AAGAGATTTC CTTGAAAGAA CTTGATTTTA AATTGAGAAA ACAACTCATT GAACAACATA 

3361 ATTTGTACGG TAATATTGGT TCAGGAACAA TCGTTATTAA AATGAAAAAT GGTGGAAAGT
3421
3481 
3541 
3601 
3661 
3721 
3781 
3841 
3901 
3961 
4021 
4081 
4141 
4201 
4261 
4321 
4381 
4441 
4501 
4561 
4621 
4681 
4741 
4801 
4861 
4921 
4981 
5041 
5101 
5161 
5221 
5281 
5341 
5401 
5461 
5521 
5581 
5641 
5701 
5761 
5821 
5881 
5941 
6001 
6061 
6121 
6181 
6241 
6301 
6361 
6421 
6481 
6541 
6601 
6661 
6721 
6781 
6841 
6901 
6961 
7021 
7081 
7141 
7201 
7261 
7321 
7381 
7441 
7501 
7561 
7621 
7681 
7741 



ACACGTTTGA 
CAAATATTGA 
TTGAACGATG 
TTTTGCGTAA 
CTTGAAAATG 
TAACGAAAAT 
AATAGACCGA 
GAATATAATA 
GCACTATCAC 
GTAGCATCAG 
TCGCCTTCAC 
CCTTCACCAA 
TCATCTAATG 
ATAAACCCTA 
AAGCAATTTG 
GATTGGTTCA 
C CAT AC GAT A 
TATTCTGTCG 
AGCATTACTA 
ACTAAAGAAG 
CAACATAATT 
GGGAAGTACA 
GATGGCACTA 
AGAACCGTTG 
CTTTCGTGTT 
CTTATGTCCT 
CATTTTTATT 
TTAAACAAGG 
ATAAACGTAA 
GCAATTGCGA 
GCTCAAACTG 
TTCGACTTGA 
CGTTATAGCA 
ATACAAATTT 
ATATTTGTAG 
ACTAAGAAAA 
AAAGAAGTAG 
ATTTCGCTGA 
T AT AAAAAG T 
ACGTTTGAAC 
AAT AT T GAGA 
TATGGATAGT 
CG AT C G G AAA 
TTGATACCTA 
T AAAT T G AAA 
TACTGTTAGG 
ATTCATAGCG 
AT T AAT C AC A 
TG AAAT T AAA 
TT AC AT C AG A 
ATGATATTAG 
GTGGTAAGGT 
ATCACCAATT 
AAAATGTCTT 
GTGTAACGAA 
TTAATGGTGA 
CATTAAAAGA 
AAGGTACATC 
TTGATTTAGG 
TTAGAGGTAT 
AATAAATTTT 
GAGCGTATTT 
GAGTATACTA 
GTTCATTAAC 
ACTTTATAAT 
TTGGAGCAAT 
TTAACAACAG 
GAAACACAAC 
GAATCAACAA 
AACCAACGAA 
GAACTGACGC 
ATATATAGCG 
CATCCGGGAC 



AT T AC AC AAA 

TAGAATTGAA 

AAATGAGAAG 

TGATGTTAAG 

ATGCAATGCC 

TAT GG ATT AC 

TACGTCTTCT 

GATTGGAGCA 

TTTTAACAAC 

CACTTAAAAC 

C AAAT G T AC A 

ATGTACAACC 

TCGAAACAAA 

AATATAAGGA 

GCTTTATGCT 

TTTATAAAAT 

ATATTGATGT 

GGGGTATCAC 

AAAAAG AT G A 

AGATTTCCTT 

TGTACGGTAA 

CGTTTGAATT 

GTATCGAAAG 

AACGATGTAC 

GCACAGTGGT 

AAATGTGTCT 

TAACGAACAT 

TTTAATATGA 

TCGAACAAAT 

AAGCCAGTTT 

TAAATGCGAG 

GAGATTACTA 

AAGGTGGCAA 

TTGGTAAAGA 

TTAAAGAAGC 

ATCAAGGCGC 

GTGCAGGTGT 

AAGAACTTGA 

TTCCTAAAGC 

TCAATAAAAA 

AAATAGAAGC 

ATAGATGAGC 

TGAAGTGGGT 

TAGTCGCACG 

GAAAGTGTGA 

ATTCCATTAA 

CAGTATCCGG 

TAAACATATA 

AACGTTAGCT 

AGGTCAAGCA 

AGATTTACAT 

TGAAAACTAC 

ATTCTTATTA 

TGTAGTACAA 

GAAAAACAAC 

AGATTTAGAT 

GCTTGATTTC 

TAAATACGGT 

TGATAAATTA 

ATCAGTCACT 

GAGGCAGCTT 

AAAGCAACGT 

AATAAGTCAT 

AATGATTCAA 

AAAGGCAGAT 

AAATTAT GAA 

GTGTGATGAC 

GCAAATATTA 

ACATTAGTGT 

ATAAAAATTT 

ATGGCCGTGA 

TTGGCGGTAT 

TGCAAGTTAA 



AAATTGCAAC 

GTGAATCTAA 

TTAAGCGACA 

G G AAAT T ATT 

CAATGTGCAA 

TTAACGATGA 

ATAATGTAGT 

TACAACTATG 

AGGTGCAAGT 

AGAACAGGCA 

ACCGCAATCG 

GCAATCGCCA 

ACAACCGCAA 

TTTAAGAACA 

GAAACCATGG 

AGCTTTAGTT 

ATTTATCGTT 

GAAGACAAAT 

GAAAGGCAAA 

GAAAGAGCTT 

TATTGGTTCA 

GCATAAAAAA 

AATTGAAGTG 

CGAGAAGTTG 

ATAAGAAGAC 

AAAAAT AC C A 

TATGGATTCC 

ATAGAGACAT 

CTAATAATTA 

AGCTCTAAGT 

CGAACATGAA 

TAGTCGCGCA 

GCACTACCTT 

TATAGAAAGG 

GGAAAATCGC 

TT ACT AT GAT 

TTCTGTGCAT 

TTTTAAATTA 

TAGCAAGATA 

GTTACAAACA 

CAATATTAGA 

TAGGCAGCAT 

CGATGAAAGA 

GTGTCGCACT 

ATGGTGAAAA 

TTAGCTTAAC 

CT TAT AT AAT 

TTAAGTCAAA 

AAAGCAACAT 

GTTCAAGCGG 

CGATACTACT 

AATGGTTCTA 

GGAAAAGATA 

GAATTAATTG 

AAAACTTCTG 

GCATCAATTG 

AAAAT TAG AC 

AAAATCATTA 

CAATTCGAGC 

ATTAACCAAA 

AAC GAT G AAA 

CGTCGTTAGG 

ACCACTCATT 

AGTTATTTAA 

GTTTCAAATT 

ATTAACAGCA 

AGCAGAAAGT 

TAT AAAT AT G 

TAAAAGTGAA 

CAAAGTATTT 

TGTCTTTGCA 

AACAAAGAAA 

AAAAGTTGAT 



AACATCGCAT 

AATCATCGTG 

AT GAC CT AC A 

CACTTGTTTG 

GCGTTGAATC 

TTCAAATATA 

TGTATGATTC 

AAAATGACAC 

ATAGTAACAA 

ATAAAATCTA 

CCACAACCAA 

CAACCAACAC 

TCGCCAACCA 

TATTATACGA 

ACGACAGTTA 

GGAAAAGATG 

TTAGAGGACA 

AGTAAAAAAG 

ATCTCACACG 

GATTTTAAAT 

GGAACAATTG 

TTACAAGAAC 

AAT C T AAAAT 

AGTGACAAAA 

GAATATTTGA 

ATCTATCAAG 

TTAATTTAAT 

ACGCCAACTA 

CGATTGGAGC 

ATTTTAGCGA 

TC AAAAT AT G 

AGTAAGGAAC 

ATCTTTGATA 

ATTAAAAAAC 

AACGGCACAG 

TACTTAAGCG 

GTTAAAAGAT 

CGACAGTATT 

AAAGTGACAA 

AACCGAATGA 

TAATTCAACG 

AGGTTGCTTA 

T AAT AAC GAC 

AGTGACATGA 

AAAT AG CAT A 

ATTGGTTCAG 

GATAGTAGAT 

AT T T AAAAAT 

TAGCATTAGG 

CAGAAAAACA 

CATCAGAAAG 

ACGTTGTACG 

AAGAACAATA 

ATCCAAACGG 

AAACTAATAC 

ACTCATTTTT 

AACAATTAGT 

TCAATTTGAA 

GCATGGGCGA 

TTTAAAGTAA 

CGTTGAATAA 

CTGTTTTTTA 

AAAACCACGT 

ACGCACGTTA 

ACGTAATCAT 

ATAGCTAAAG 

CAAACTGTAA 

CTAAAAGATT 

GATTATTATG 

CTAATTGGCG 

GTACCTGAAT 

AAT GT GAG AT 

CCTAAAGATG 



GGCAGATGTT 

ATGGATACTA 

TCATTGCTTA 

TAAAAGTGAC 

GCTTCGAAAC 

GTTAAACATG 

TAATTACGTA 

AAATTGCGAA 

CGCAATCGGC 

AAATACAAAA 

CACCAAGTGT 

CAAACCCAAC 

CAAAACAAGC 

AACCGAGTTT 

GGTTTATGAA 

ATAAGAAATA 

ATAAATATCA 

TTGATCACAA 

ATGATTCAGA 

TGAGAAAGCA 

TTATTAAAAC 

ATCGCATGGC 

CATCGTGATG 

ATTTACATGT 

TGTTGACCAC 

CGTTGAAGTA 

TAACGATGAT 

TAATAAAGCT 

ATACAACTAT 

CTGGGGTTAT 

AAAATGT G AC 

TTAAAAATGT 

AAAATAGAAA 

GTAAAAATCC 

TGTATTCATA 

CACCTAGATT 

ACTACATTTA 

TAATTCAAGA 

TGAAAGATGG 

GTGACGTCAT 

AAACATGGAT 

GCTTCTTTTT 

AATAAAAACT 

AACAATGTGG 

GAGTTATAAA 

GAATAGTTAA 

TGTTCGTATT 

AGATTGGGAG 

TTTATTAACT 

AGAGAGAGTA 

TTTCGAATAT 

CTTTAACCCA 

TAAAGAAGGT 

CAGACTATCT 

ACCTTTATTT 

AATCCAAAAA 

T AAT AAT T AC 

AG AC G AAAAT 

TGTGTTGAAT 

GCAATCAATG 

AT AT GT AT AT 

CATTTGATAA 

ACTTTGAATA 

ATGTCAGTTT 

AACAATCCAA 

CAACATTAGC 

ACGCGAAAGT 

ACTATTCTCA 

GGTCTAACGT 

ACGATAGAAA 

TAATAGATAC 

CCGTGTTTGG 

GCTTTTCGAT 



ATTGATGGTA 

CATAGAACCG 

GCTTCTTTTG 

ATCTCAATGT 

CATCTGTATT 

GTTTAATGTG 

AAGAACAATC 

AACCAGTTTA 

TAATGCAGAA 

AGTAACAACA 

AACAACTCCG 

AACACCGCAT 

ACAAAAGGAA 

AGAATTCGAA 

TGTTATACCA 

TAAAGATGGG 

ATTGAAAAAA 

AGCAGAATTA 

ATATAAGATA 

ACTCATTGAA 

GAAAAACGGT 

AGATGTCATA 

GATACTGCAT 

TGCTCAGCTT 

ACATAACATG 

CAACAAAAGT 

T C AAAT AT AG 

GTATGATTCA 

GAAGATGACA 

AACATCAACG 

AAAAG AT AT C 

TACTGGCTAT 

GTTCACAAGA 

AGGATTAGAT 

TGGTGGCGTT 

CGTTATTAAA 

TAAGGAAGAG 

TTTTGATCTG 

CGGCTATTAT 

AGATGGCAGA 

AATAGTAAAA 

TGTGTCGTTA 

GT G AAAAT AG 

AAAA CAT AAT 

AAATGATTAA 

AAAGCGGTTA 

ACGTAATTGA 

AAT AAT AC T A 

ACTGGTGTCA 

CAACATTTAC 

AGTAATGTTA 

AAAG AT C AAA 

CTACAAGGCC 

ACTGTTGGTG 

GTTAATAAAG 

GAAGAAAT CT 

GGAT TAT AT A 

AAAG TAG AAA 

AGTAAAGACA 

ACTTTAAAGT 

CCTATCAAAG 

CACAAGTCAC 

TTAATTAGGT 

GTTTCGATGC 

TACATTAAGA 

ATTAGGAATA 

AAAGTTGGAT 

AGAAAGCTAT 

TTTAAACTTT 

TAAATATAAA 

TAAAGGCGGC 

CTATGTAAGT 

AAAAGAGTTG 
7801 TTCTTTATTC AAAAAGAAGA AGTATCATTG AAAGAACTTG ATTTTAAAAT CAGAAAAATG
7861 TTAGTCGAAA AATATAGATT GTATAAAGGC GCGTCAGATA AAGGTAGAAT TGTTATCAAT
7921 ATGAAAGACG AAAAGAAACA TGAAATTGAT TTAAGTGAAA AATTAAGTTT TGATCGTATG
7981 TTTGATGTGC TGGATAGTAA GCAGATTAAA AATATTGAAG TGAATTTGAA TTAATTAAAG
8041 ATGATAGTAT AAAAAATTAA AAAGCGGCTT AACGATAAAA TGTGCATTGA CACCCGTACC
8101 TTTGAATAAG AACTGTGTTA AATACATTAT TGTTGTTAAG TTGTTTTTTG CGTTTCAAAG
8161 AGTAGAAATA ACACAATTAG TTGTAGAAAA CGATGATCCG GAAAAACAAC GACATGTAAA
8221 GCGTGAAAGT TAATTAACCT GGACATTAAA ATATATTTGT TTTTAATTAA TAATAATTCA
8281 CGCATATTTA AATCGAGGTT AATTATCGCG TTAAACGATG GACGTTATAA TAAGCGTATA
8341 TGATTCAAAT TACGTAATAA TAACAATCCA ATACATTAAG ATTGGAGTAA ATAAATATGA
8401 AATTGACAGC ATTAGCAAAA GTAACATTAG CATTAGGGAT TTTAACAACA GGAACTTTAA
8461 CAACAGAGGC CCATTCAGGT CATGCTAAAC AAAATCAAAA GTCAGTAAAC AAACATGACA
8521 AAGAAGCATT ACACCGATAC TACACTGGAA ACTTTAAGGA AATGAAAAAC ATTAATGCTT
8581 TGAGACATGG TAAAAATAAC TTACGTTTTA AATATAGAGG GATGAAGACT CAAGTATTAT
8641 TGCCTGGAGA TGAGTACCGT AAATATCAAC AGCGAAGACA TACGGGCTTA GATGTGTTTT
8701 TTGTTCAAGA AAGAAGAGAC AAGCACGACA TATCATATAC TGTTGGTGGT GTAACAAAGA
8761 CCAATAAAAC ATCCGGCTTT GTCAGCACAC CAAGATTAAA TGTTACAAAA GAAAAGGGTG
8821 AAGATGCTTT TGTGAAAGGT TACCCTTATG ATATTAAAAA AGAAGAAATA TCATTGAAAG
8881 AGTTAGATTT TAAGTTGAGA AAGCATCTAA TTGAAAAATA CGGTCTTTAT AAAACACTCT
8941 CAAAAGATGG TAGGATTAAA ATTAGTCTGA AAGATGGTAG CTTTTATAAC CTTGATTTAA
9001 GAACTAAATT AAAATTCAAA CATATGGGGG AAGTCATAGA TAGTAAACAA ATTAAGGATA
9061 TTGAAGTGAA TTTAAAGTAA ATCATTATGA ATAATAAAAA GTAATTGAAG CGGCTTAACG
9121 ATGAAAAGTA AATTGGTGCG TATACCTTAC CAAAAGGATG CATCAATCGA TATCGTCGTT
9181 AAGCCGTTTT TGTTTGCGTG TTATGAATCC TATCCCAATC TCCATGAATA TAAAATTTCC
9241 ACCATCAACA TCAAAATTCT CAACATCGCG ACACCACCAA ATGTTATAAT AAATCTATTA
9301 CACAAAGAGA CAAATTACTT ATGCAAAGGC GGAGGAATCA CATGTCTATT ACTGAAAAAC
9361 AACGTCAGCA ACAAGCTGAA TTACATAAAA AATTATGGTC AATTGCGAAT GATTTAAGAG
9421 GGAATATGGA TGCGAGTGAA TTCCGTAATT ACATTTTAGG CTTGATTTTC TATCGCTTCT
9481 TATCTGAAAA AGCGGAACAA GAATATGCAG ATGCCTTGGC AGGTGAAGAT ATTACATATC
9541 AAGAAGCATG GGCAGATGAA GAATACCGTG AAGACTTAAA AGCAGAATTA ATTGATCAAG
9601 TTGGTTACTT CATTGAGCCA CAAGATTTAT TCAGTGCGAT GATTCGTGAA ATTGAAACGC
9661 AAGATTTCGA TATCGAACAT CTGGCGACGG CAATTCGTAA AGTTGAAACT TCAACACTAG
9721 GTGAAGAAAG TGAAAATGAC TTTATCGGAC TGTTCAGCGA TATGGACTTA AGTTCAACGC
9781 GTTTAGGTAA CAATGTCAAA GAACGTACTG CGTTAATTTC CAAAGTTATG GTTAACCTTG
9841 ATGACTTACC ATTCGTTCAC AGTGATATGG AAATTGATAT GTTAGGTGAT GCATACGAAT
9901 TTCTTATCGG GCGCTTTGCG GCGACAGCGG GTAAAAAAGC AGGCGAGTTC TATACACCAC
9961 AACAAGTATC TAAGATACTG GCGAAGATTG TCACAGACGG TAAAGATAAA TTACGTCATG
10021 TATATGACCC AACATGTGGT TCAGGTTCAT TACTGTTACG CGTTGGTAAA GAGGCAAAAG
10081 TGTATCGTTA TTTTGGACAA GAACGTAACA ATACCACATA CAACTTAGCG CGCATGAACA
10141 TGTTGTTACA TGATGTGCGA TATGAAAATT TCGATATCCG TAATGATGAT ACGTTGGAAA
10201 ATCCAGCCTT TTTAGGACAT ACATTTGATG CGGTTATTGC GAACCCACCA TATAGCGCGA
10261 AATGGACAGC AGACTCAAAA TTTGAAAATG ACGAACGCTT CAGCGGATAC GGCAAGCTTG
10321 CGCCAAAGTC TAAAGCAGAC TTTGCCTTTA TTCAACACAT GGTACATTAC CTAGATGATG
10381 AAGGTACCAT GGCCGTTGTA CTCCCACATG GTGTCTTATT CCGTGGTGCC GCAGAAGGTA
10441 TCATTCGTCG TTATTTAATT GAAGAAAAGA ACTACTTAGA AGCCGTGATT GGTTTGCCAG
10501 CGAACATTTT CTATGGGACA AGTATTCCAA CATGTATCTT AGTATTTAAA AAATGTCGCC
10561 AACAAGACGA CAACGTATTA TTTATCGATG CATCCAATGA TTTTGAAAAA GGAAAAAACC
10621 AAAACCATTT AAGCGATGCC CAAGTCGAAC GAATTATAGA CACATATAAG CGTAAGGAAA
10681 CAATTGATAA GTATAGCTAC AGCGCGACAT TACAAGAGAT CGCCGATAAC GATTACAACT
10741 TAAACATACC GAGATATGTC GATACATTCG AAGAAGAAGC GCCAATTGAT TTAGATCAAG
10801 TCCAACAAGA TTTGAAAAAT ATCGACAAAG AAATCGCAGA AATTGAACAA GAAATCAATG
10861 CATACCTGAA AGAACTTGGG GTGTTGAAAG ATGAGTAATA CACAAACGAA AAATGTGCCA
10921 GAGTTGAGAT TCCCAGGGTT TGAAGGCGAA TGGGAAGAGA AGAAGGTTGG CGAGTTATTA
10981 GAATTTAAAA ATGGTTTAAA TAAAGGAAAA GAATATTTTG GCTCAGGATC GTCGATTGTT
11041 AACTTCAAAG ATGTATTTAA TAACAGGAGC TTAAATACAA ATAATCTGAC TGGAAAAGTT
11101 AATGTGAATA GCAAAGAACT AAAAAATTAT TCTGTTGAAA AGGGTGATGT TTTTTTTACA
11161 AGGACTAGTG AGGTAATTGG TGAAATAGGT TATCCGTCTG TAATTTTAAA TGACCCTGAA
11221 AATACTGTGT TTAGTGGATT TGTATTAAGA GGGCGGCCTA AATCAGGAAT TGATTTAATA
11281 AATAATAATT TTAAAAGATA TGTCTTTTTT ACTAATTCAT TTAGAAAAGA AATGATTACA
11341 AAAAGTTCTA TGACAACTAG AGCTTTAACA TCAGGTAGCG CAATTAATAA AATGAAGGTC
11401 ATATACCCTG TTTCGGCTAA AGAACAGAGA AAAATAGGTG ACTTCTTCAG CAAACTCGAC
11461 CGACAAATTG AATTAGAAGA ACAAAAGCTT GAATTACTTC AACAACAAAA AAAAGGCTAT
11521 ATGCAGAAAA TCTTCTCACA GGAACTGCGA TTCAAAGATG AGAATAGTGA AGATTATCCA
11581 CATTGGGAAA ATAGCAAAAT AGAAAAATAT TTAAAAGAGA GAAACGAACG TTCTGACAAA
11641 GGTCAAATGC TTTCAGTAAC TATAAATAGT GGCATTATAA AATTTAGTGA ATTGGATAGA
11701 AAAGATAATT CAAGTAAAGA TAAAAGTAAT TATAAAGTAG TTAGGAAAAA TGATATTGCA
11761 TATAATTCTA TGAGAATGTG GCAAGGGGCT AGTGGTAGAT CAAATTATAA TGGGATTGTT
11821 AGCCCTGCAT ATACTGTGCT TTATCCAACA CAAAATACTA GCTCATTATT TATTGGATAT
11881 AAGTTTAAAA CACATAGAAT GATTCATAAA TTTAAAATTA ATTCACAAGG ATTAACATCA
11941 GATACATGGA ACTTAAAATA TAAACAATTA AAAAATATAA ATATAGATAT ACCTGTATTG
12001 GAGGAACAAG AAAAGATAGG TGATTTCTTT AAAAAAATGG ATATATTGAT TAGTAAACAG
12061 AAAATAAAAA TTGAAATATT AGAAAAAGAG AAACAATCCT TTTTACAAAA GATGTTCTTA
12121 TAACTTTGAT AAACACATAG GTTGCATAAG AATAAAATTT GTGTAATTTA ACATAAAAGT
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12181 TATAAACATA AAGTAAATTA AAAACGAACA TTAAATTTAG GCACTGTGAT AGCACAGTGT 

12241 CTTTTTTGTG TCGAAATTGT GTACAGAATA AGTAGTTAAA TAAAGATTAA GTTGAGATAA

12301 AGTGTTATTC GTAAATAAAA GAGAGTAGAT CGATAGGAAT TGAATGATAT TAGTTAACTA

12361 TTTATTAAAT CACTTAATAA TGATTAATTT TTAGTTAAAG TAAGTTTAAT GTGAAGCACG 

12421 ACCGTTGCTC ATTATAATGA ATGAGGATTG TTCGTATTGC GTAATAGAAT AAATCAAATA 

12481 GACTAAAAAT TGGGAGCATA GAATTATGAA ATTAAAAAAT ATTGCTAAAG CAAGTTTAGC

12541 ACTAGGGATT TTGACAACAG GGATGATTAC AACTACTGCT CAGCCAGTAA AAGCAAGTGA 

12601 GCAAAGCAGA TTATCAGTTA CTTCAAACGA CACGCAAGAA TTAAAAAAAT ACTACAGTGG 

12661 AACAGGATAT AATTTTCAAA ATGTGAGTGG TTATAGAGAA AAGGATAAAA TGAACATTAT 

12721 TGATGGGACA CAACTTAATG TAGTTACTTT ACTTGGTACA GACAAAGAAA GATTTAAAGA 

12781 TTATGACTAT GATTATGAAG GGTTAGATGT CTTTGTAGTC AGAGAAGGAT CAGGTAAACA 

12841 AGCTGAAAAT ATTTCAATAG GTGGAATTAC CAAGACGAAT AAAAACGATT ATAAAGATTT

12901 CGTAAATAAT GTAGGTCTAG AAATAACTAA ACCAACAGGA CATAATACAG CAACAAGACA 

12961 AGCAGAAACT TATAGAATTA ATAAAGAAGA AATTTCATTA AAAGAATTAG ATTTCAAATT 

13021 AAGAAAACAT TTAATTGAAA ATCATGAACT TTATAAGACA GAGCCTAAAG ACGGTAAAAT 

13081 TAGAATTACT ATGAAAGGTG GCGGCTACTA TACTTTTGAA TTAAATAAAA AATTACAGCC 

13141 TCATCGTATG GGTGATGTAA TTGATGGTAG AAATATAGAA AAAATTGAAG TCGATTTATA

13201 TTAATATTCG AGGGAGTATA TCATGAGGGA AAATTTTAAG TTACGTAAAA TGAAAGTCGG 

13261 GTTAGTATCT GTTGCAATTA CAATGTTATA TATCATGACA AACGGACAAG CAGAAGCATC 

13321 AGAGGCTAAT GAGAAGCCAA GTACAAATCA AGAATCAAAA GTTGTTTCAC AGACTGAACA 

13381 AAATTCAAAA GAAACAAAAA CAGTAGAATC TAATAAGAAC TTTGTTAAAT TAGATACTAT 

13441 TAAACCTGGA GCTCAAAAGA TAACGGGAAC TACTTTACCA AATCACTATG TTTTATTAAC

13501 AGTTGATGGG AAAAGTGCGG ATTCAGTAGA AAATGGCGGT TTGGGTTTTG TTGAAGCAAA 

13561 TGACAAAGGA GAATTTGAGT ACCCTTTAAA TAATCGTAAA ATTGTTCATA ATCAAGAAAT 

13621 TGAGGTTTCG TCGTCAAGCC CTGATTTAGG TGAAGATGAA GAAGATGAAG AGGTGGAAGA

13681 AGCTTCAACT GATAAAGCTG GC 



SEQ ID No. 106 



S. aureus strain EMRSA 16(252) (SSL12-SSL14)

1 CTACATTTAA AGTATAACTA TTTTTCAAGT AGTTAGAAAA TTCAATAAAT AACTAATTTA 

61 ACGCAATTAA TATCATAATT AGCATCAGTT TTAATTTATT AGCACTCTTT TCATTTTATA 

121 ATTGGCAATG TTCCAATGCT TAAAAATGAA AAATTTTACT TAAATCACAC GTTCAACTGT 

181 TTATGTTTAT TTAACATAGT GTTAGCTTTT ATTTAATTCC GAATCGGTGC AATACCTTAT 

241 ATACTATGGG TAATTCACGC AAAGGAGATT ATCATATGAA AAAGAACATC ACGAACAAAA 

301 TTGTTTTATC AACAGCATTG TTACTTTTAG GAACTACATC AACACAACCG CCTAAAGCAC 

361 CAATCAGTTT TTCATCTGAA GCAAAAGCAT ACAATATAAG TGAAAACGAA ACGAATATCA

421 ATGAATTAAT CAAATATTAC ACGCAGCCAC ACCTTTCATT TTCTGGAAAA TGGCTATGGC 

481 AAAACCCTAA CGGTACCATA CATGCAACAT TACAAACGTG GGTTTGGTAC ACGCATATTC 

541 AAGTATTTGG TCCTGAAAGT TGGGGCAACA TTAATCAATT AAGAGATAAA TATGTAGATA 

601 TTTTTGGTAC CAAGGATGAG GATACTATAG AAGGCTATTG GACTTATGAT GAAACCTTTA

661 CTGGAGGTGT AACACCAGCA GCTACTTCAT CTGATAAGCC TTATAAATTA TTTTTAAAAT 

721 ATAAGGATAA ACAACAAACT ATGATTGGTG GACTTGAGTT TTATCAAGGT AATAAACCAG 

781 TAATAACTTT AAAAGAACTA GATTTCCGCG TACGTCAAAC TTTAATTAAG AACAAAAAGC

841 TATATAATGG AGAATTTAAT AAAGGTCAAA TTAGGATTAC AGGTGGTGGT AATCATTATA

901 CGATTGATTT GAGTAAAAAG TTAAAACTAA CTGACACAAA CAGTTATGTT AAAAATCCTC 

961 GACATGCAGA AATTGAAGTC ATTCTTGAAA AATCTCATTA GTATCATTCA TTGAGTGAAT 

1021 ACAAATAGAC TGAATGATTT AATTGATTTC TCTCTTTTTA TAAACTTATT ATTCCTCGCA

1081 TATAACATTC CTCGTAAAGG AGATTTTCTA ATGAAAAACA ACCTCACAAA AAAAATTATT

1141 TTATCAATAG CATTGACTAT GATAGGCACA GCTACAAGTC AATCCCCTCA TTCACCGATT

1201 AGTATGTCAT CTGAAGCAAA AGCATACAAT ATAAGTGAAA GCGAAACGAA TGTAAATGAA

1261 TTGACTAAAT ATTACACACA ACGCCATTTA ACATTTTCAA ATAAATGGCT ATGGCAAAAA

1321 GATAATGGTA CAATCCATGC AACATTGTTA CAGTTATCAT GGTTTAGTCA TATTCAAGTA 

1381 TTCGGCCCTG AAAGTTGGGG TAATATTAAT CAATTAAGAA ATAAATATGT AGATATTTTT 

1441 GCTTTAAAAG ATTATGAAAC CTGGCGCACT TATATGTTAG CCCAAGAAAC ATTTACTGGC

1501 GGTGTTACAC CAGTAGCAAA ACCTAGCGAT AAACATTATA AATTGAATGT AACATATAAA 

1561 GATAAAGCAG GAACATTTAT TGGCGAATAT CAATTTTATA CAGGTAATAA ACCAGTTTTA 

1621 ACTTTAAAGG AAGTTGATTT CCGAATCCGC CAAACACTAA TTAAAAACAA AAAATTATAT

1681 AATGGTGATT ATAATAAAGG CCACATTAAG ATTACAGGTG GCAGCAACAA CTACACAATA 

1741 GATTTAAGCA AAAGATTAAA ATCAACTGAT GCAAATAGAT ATGTTAAAAA CCCTCAAACT

1801 GCACACATCG AAATCCTCCT TGAAAAATCT AGCTAACATC GAAATCGAAG TAAACAGAAA 

1861 TAAGCACAAT GTTTACATTG ACGCCGTTCA TTTTTAAATT TGTTAATTCT CGTATTGATG

1921 ATTACGCAGA AAGGAGAAGT TCGAATGAGA AAGTGCATTA TGAAAAAGTT AATTTTAATG 

1981 ACAACATTGT TATTATTAGG AACAACAGCT ACACAAACCT CTAATTCGCC ATTAAATGTT 

2041 TCTACAGACG CTAAAGCTTA TCATATTGGC CAAGACGAAA CGAACATCAA TGAGTTAATT

2101 AAATATTACA CACAACTGCC TCTCACATTT TCAAATAGAT GGTTATACCA ATATGATGAC 

2161 GGCAACATTT ATGTTGAATT TAAGCGTTAT TCTTGGTCAG CGCATATACG ATTATGGGGT 

2221 GCTGAAAGTT GGGGCAACGT TAATCAATTA AGAGACCGCT ATGTAGACGT GTTTGGTTTA 

2281 AAAGACGAAG ACACTAGCCA GTTATGGTGG GTATATCGTG ATACATTTAC AGGTGGTGTA

2341 ACACCTGCTG CATCACCTTC TGATAAACCA TATAGTCTTT TTGTCCAATA CAAAGATAAA

2401 CTACAAACAA TTATTGGTGC ACATAAGATG TACCAAGGTA ATAAACCAAT ATTAACATTG

2461 AAAGAAATTG ATTTTCGAGC ACGTGAAACA CTAATTAAAA ATAAAATATT ATATCATGAA
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2521 AATCGCAATA AAGGTAAACT TAAAATCACC GGTGGCGGCA ATGACTTTAC AATTGATTTG 

2581 AGTAAAAGAT TACATTCTGA TCTTGCAAAT GTTTATGTTA AAAATCCTCA AAAAATAACA 

2641 GTCGAAGTCC TCATTGATTA AATAAGGTCG TATGATTTAT ATCGTAAATA CTAAGGATAA

2701 ACAATATTTG AGTAATACTA CAAAATGTCT CAATAGTACT TACACAATCA TTAATTTAGT

2761 ACACAGAATA TTAAATGCGA TTTGGAGTCT AATAGAAATG CCAAGTCATT CTCTGGGTTT

2821 GGGCCCGCCC CAACTTGCAT TGCTTGTAGA AATTGGTAAC CCAATTTCTC TATGTTGGGG 

2881 CCCCACCCCA ACTTACATTG T 



SEQ ID No. 107

S. aureus strain MW2 (SSL1-11)



139021 taaaaattat gaaatttaaa gcgatagcaa aagcaagttt agcattggga atgttagcaa
139081 caggtgtaat tacatcgaat gtacaatcag tacaagcgaa aacagaagtt aaacaacaaa
139141 gtgaggctga tttaaaactt tattataatg gaccaagttt tgaatataaa aaagtaactg
139201 gatatggatt tattgaaggt aaagatagat ttattgattt tatatacaat ggacaatata
139261 ataaaatatc tttagttggt tctgataaag ataaatataa tgaagaagtt aacccagata
139321 tagatgtgtt tgtcgttaga gaaggaaacg gtagacaagc tgataatcat tcgattggtg
139381 gcataacaaa aactaataga ggagtgtatt atgactatat acacacacca atccttgaaa
139441 tcaagaaagg taaagaagaa ccacaaagta gtctatacca aatttataaa gaagacatct
139501 cactaaaaga acttgatttt aaattaagaa agcaattaat tagtcaaagt ggcttgtatt
139561 caaatggtct taaacaaggt caaattacaa ttacaatgaa tgatggcaca acacatacaa
139621 tcgatttaag tcaaaaactt gaaaaagaac gtatgggcga gtctatcgat ggcagacaac
139681 tacaaaaaat tctagtagaa atgaaataat actttctaac aacaaagcgc tatgttgaa'.
139741 agtgcttgtt atggaaatat atggaagtta agcgacgtac tgttgcttag cttcttttt ..
139801 tgaggggaaa agttacaaaa ctcacacaaa cagtcgcacc acgcattatc ttttgcttaa
139861 atagcttaat catattttat gaatagttaa aaacaggtta atgtgaatat ctgaatacag
139921 ctcctataat atgggtgtat ggttcaaatt acgtaataaa acaatctaat tataatagat
139981 tggagcatac agctatgaaa atgaaatcaa ttgtaaaaat aagtttgtta ttaggaatat
140041 tagcaacagg tgtaaacact acaacggaaa aaccagttca tgccgaaaag aaacctattg
140101 taataagtga aaatagcaaa aaattaaaag cttattatac tcaacctagt attgaatata
140161 aaaatgtgac aggttatatc agtttcattc aaccaagtat taaatttatg aatatcatag
140221 atggtaattc tgttaataat attgctttaa ttggcaaaga taagcaacat tatcatacgg
140281 gtgtacatcg taatcttaat atattttacg ttaatgagga taagagattt gaaggtgcaa
140341 agtactccat tggcggtatc acgagtgcaa acgataaagc tgtcgaccta atagcagaag
140401 caagagttat taaagcagat catattggtg aatatgatta tgactttttc ccatttaaaa
140461 tagataaaga agcaatgtca ttgaaagaga ttgattttaa attaagaaaa taccttattg
140521 ataattatgg tctttacggt gaaatgagta cagggaaaat taccgtcaaa aagaaatact
140581 acggaaagta tacatttgaa ttggataaaa agttacaaga agaccggatg tccgatgtta
140641 tcaatgtcac agatattgat agaattgaaa tcaaagttag aaaagcataa cacacatact
140701 tgacgacgaa ataatttgaa attgaaatag agaggttaag tgacgatcaa acgttgctta
140761 acttcttttt aatgcttaaa aatcatttca aaggcacata gaaacgctat attaacctca
140821 taatcactca ttattttttg cttaaattac ttaataatac ttcaataatt gttaaaaggg
140881 gtttaatgtg attatcttag aatgccatct ataatgatgt tgtatgattc aaattacgta
140941 aaaagacaat cgaatataat atagattgga gcatacaatt atgaaaatga gaacaattgc
141001 taaaaccagt ttagcactag ggcttttaac aacaggcgca attacagtaa cgacgcaatc
141061 ggtcaaagca gaaaaaatac aatcaactaa agttgacaaa gtaccaacgc ttaaagcaga
141121 gcgattagca atgataaaca taacagcagg tgcaaattca gcgacaacac aagcagctaa
141181 cacaagacaa gaacgcacgc ctaaactcga aaaggcacca aatactaatg aggaaaaaac
141241 ctcagcttcc aaaatagaaa aaatatcaca acctaaacaa gaagagcaga aaacgcttaa
141301 tatatcagca acgccagcgc ctaaacaaga acaatcacaa acgacaaccg aatctacaac
141361 gcagcaaact aaaatgacaa cacctccatc aacaaacacg ccacaaccaa tgcaatctac
141421 taaatcagac acaccacaat ctccaaccat aaaacaagca caaacagata tgactcctaa
141481 atatgaagat ttaagagcgt attacacgaa accgagtttt gaatttgaaa agcagtttgg
141541 atttttgctc aaaccatgga cgacggttag gtttatgaat gttattccaa ataggttcat
141601 ctataaaata gctttagttg gaaaagatga gaaaaaatat aaagatggac cttacgataa
141661 tatcgatgta tttatcgttt tagaagacaa taaatatcaa ttgaaaaaat attctgtcgg
141721 tggcatcacg aagactaata gtaaaaaagt taatcacaaa gtagaattaa gcattactaa
141781 aaaagataat caaggtatga tttcacgcga tgtttcagaa tacatgatta ctaaggaaga
141841 gatttccttg aaagagcttg attttaaatt gagaaaacaa cttattgaaa aacataatct
141901 ttacggtaac atgggttcag gaacaatcgt tattaaaatg aaaaacggtg ggaaatatac
141961 gtttgaatta cacaaaaaac tgcaagagca tcgtatggca gacgtcatag atggcactaa
142021 tattgataac attgaagtga atataaaata atcatgacat tctctaaata gaagctgtca
142081 tcggaaaaac aagaagttaa gtgacaacgg tttacatgtt gcttagcttc ttttattatg
142141 cgtaatgatg taaaaagacg aatattcatt tgtttgtaaa agtggcattt ctatgtctta
142201 aaagtgacga aacttcaaat gtgccaagtg ttgaatcaca tcaaaatcat ttttatttaa
142261 
142321 
142381 
142441 
142501 
142561 
142621 
142681 
142741 
142801 
142861 
142921 
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146161 tgaattcagt aatattagtg gtaaggttga aaattataac ggttctaacg ttgtacgctt 
146221 taaccaagaa aatcaaaatc accaattatt cttatcagga aaagataaag ataaatataa
146281 agaaggcctt gaaggccaga atgtctttgt ggtaaaagaa ttaattgatc caaacggtag
146341 actatctact gttggtggtg taacgaagaa aaataaccaa tcttctgaaa ctaatacacc
146401 tttatttata aaaaaagtgt atggcggaaa tttagatgca tcaattgaat catttttaat
146461 taataaagaa gaagtttcac tgaaagaact tgatttcaaa attagacaac atttagttaa
146521 aaattatggt ttatataaag gtacgactaa atacggtaag atcactttca atttgaaaga
146581 tggagaaaag caagaaattg atttaggtga taaattgcaa ttcgagcaca tgggcgatgt
146641 gttgaatagt aaggatattc aaaatatagc agtgactatt aatcaaattt aaagtaagta
146701 atcaatgact ctaaagtaat aaatttgaag cagcttagcg atgaaatgtt gaatagatac
146761 gtacacctta cataaaggag cgtatttaaa acaaccttgt cgttaggctt tttttacgtt
146821 ttataacgca gggtatgagc gtactaaaaa ttcacattac ttctgaaagt gatgtccatt
146881 gaatattaat tagttcttca ttaaccatga tttaatttta attaaacgag tgttaatgtc
146941 agtctgtctc aatgcccttt ataataaatg tgtattattc aaattacgta ataaaagcaa 
147001 tccaatatat taagattgga gcatatgaat atgaaattta cagcgatagc taaagcgata
147061 tttgtattag gaatattaac aacaagtgta atgataacag aaaatcaatc ggttaatgca 
147121 aaaggaaagt atgaaaaaat gaaccgttta tatgatacaa acaagttaca tcaatactat 
147181 tcaggaccta gttatgagtt aacaaatgtt agtggccaaa gtcaaggtta ttatgactct 
147241 aacgttttgc tttttaacca acaaaatcaa aagttccaag tgtttttatt gggaaaagat 
147301 gaaaataaat acaaagaaaa aacacatggt ttagatgtct ttgcggtacc ggaattagta
147361 gatttagatg gaagaatatt tagtgttagt ggtgtaacaa agaaaaatgt aaaatcaata 
147421 tttgagtctc taagaacgcc gaacttacta gttaaaaaaa tagacgataa agacggtttt 
147481 tcgtatgatg aatttttctt tattcaaaag gaagaagtat cattgaagga acttgatttc 
147541 aaaataagaa aactgttaat taaaaaatac aaattgtatg aaggggcagc tgataaaggt
147601 agaattgtta ttaatatgaa agatgaaaat aagtatgaaa ttgatttaag tgataaatta
147661 ggtttcgagc gtatggcaga tgtcattaat agtgaacaaa ttaaaaacat cgaagtgaat
147721 ttgaaataat caatgatata tatagaatga aagcttaaga agcggtttaa taatcccatg 
147781 tttaatgatt ttgatacgtg ttttaataat aaaaacatat cgaacattga ctacgttatt
147841 aagctgcttt tttgtacact ttgtatcgaa taacttaaga tctaaaacta atcggaaaga 
147901 acaatgattc ccctaaaaaa atttatgttg ctattaaaaa tcagttaata cgaatgttaa
147961 catacgtttg attttcatta ataatgattc aagtttattt aaatgagcgt taatgtcagt
148021 ctgttttgat gcaccttata ataaagacag atagttcaaa ttacgtaata ataacaatcc 
148081 aatatatcaa gattggagca aataaatatg aaatttacag cattagcaaa agcaacatta 
148141 gcattaggaa tattaactac aggtgtgttt acaacagaaa gtaaagctgt tcacgcgaaa 

148201 gtagaacttg atgagacaca acgcaaatat tatatcaata tgctacatca atactattct
148261 gaagaaagtt ttgaaccaac aaatattagt gttaaaagcg aagattacta tggctctaac 
148321 gttttaaact ttaaacaacg aaataaagct tttaaagtat ttttacttgg tgacgataaa 
148381 aataaatata aagaaaaaac acatggcctt gatgtctttg cagtacctga attaatagat 
148441 ataaaaggtg gcatatatag cgttggcggt ataacaaaga aaaatgtgag atcagtgttt 

148501 ggatttgtaa gtaatccaag tctacaagtt aaaaaaatcg atcctaaaca tggcttttcg
148561 ataaatgagt tgttctttat tcaaaaggaa gaagtatcgt tgaaggaact ggattttaaa 

148621 ataagaaaaa tgttagtcga aaaatataga ttgtataaag gcgcgtcaga taaaggtaga
148681 atcgttatta atatgaaaga cgaaaagaaa tatgtaattg atttaagtga aaaattaagt
148741 tttgatcgta tgtttgatgt aatggatagt aagcaaatta aaaatattga agtgaatttg
148801 aattaattaa gtataataac ttaagaagcg acttaacgac aaaatgtgaa ttgacatgca
148861 tgtccttaaa taaggaactg tgttaaatac attactgttg ttaagttgtt ttttgcgttt
148921 caaagagcag aacagagtaa catcatcagt tgtagtaaac gataatctag taaaacaact 
148981 aaatgaaata atgaaattca tttaacctga acattaaaat atatttgttt ttcattaaga
149041 ataattcaag tatatttaaa tcgaggttaa ttatcgtatg aaacgatgca cgttataata 

149101 aaaatgtatg attcaaatta cgtaatgaaa acaatccaat atattaagat tggagcaaaa
149161 caatatgaaa ttaacagcga tagctaaagc tgcattagct ttaggaattt taacaacagg
149221 aactttaaca acagaagttc attcaggtca tgcaaaacaa aatcaaaagt cagtaaataa
149281 acatgacaag gaagcattat accgatacta cactggaaag actatggaaa tgaaaaatat
149341 tagtgctttg aaacatggta aaaataactt gcgttttaag tttagaggta ttaagattca
149401 agttttactg cctggaaatg ataaaagtaa atttcaacag cgtagttatg aggggttaga
149461 tgtgtttttt gttcaagaaa aaagagataa gcacgatata ttttatactg ttggtggtgt
149521 aatacagaat aataaaacat ctggagttgt cagtgcacca atattaaata tttcaaaaga
149581 aaagggtgaa gatgcttttg tgaaaggtta cccttattac attaaaaaag aaaaaataac
149641 actaaaagag ctggattata agttgagaaa gcatctaatc gaaaaatatg gactttataa
149701 aacaatctca aaagatggta gggtcaaaat tagcttgaaa gatggcagtt tttataacct
149761 tgatttaaga tctaaattaa aattcaaata tatgggggaa gtcatagaaa gcaaacaaat
149821 taaagatatt gaagttaact taaagtaaat aattacgaat aataaaaagt aattgaagcg
149881 gcttaacgat gaaaagtaaa ttgatgcgca taccttacca aaaggatgca tcaatcgata
149941 tcgtcgttaa gctgttttgg tttacgtttc atggattcta ccccaatttt cataaatata
150001 aaaattccac caccaacatc aaaattctca acatcgcaac atacccaaat gttataataa



SDOCID: <WO 200509291 BA2J_> 



WO 2005/092918 



PCT/GB2005/001084



150061 atctattaca caaagagata aattacttat tcaaaggcgg aggaatcaca tgtctattac 
150121 tgaaaaacaa cgtcagcaac aagctgaatt acataaaaaa ttatggtcga ttgcgaatga 
150181 tttaagaggg aacatggatg cgagtgaatt ccgtaattac attttaggct tgattttcta 
150241 tcgcttctta tccgaaaaag cagaacaaga atatgcagat gcgttggcag gtgaagatat 
150301 cacgtatcaa gaggcatggg cagatgaaga atatcgtgaa gacttaaaag ctgaattaat
150361 tgatcaagtc ggttacttca ttgaaccaca agatttattc agtgcgatga ttcgtgaaat
150421 tgaaacgcaa gatttcgata tcgaacatct ggcgacggca attcgtaaag ttgaaacatc 
150481 aacattaggt gaagaaagtg aaaatgactt tatcggactg ttcagcgata tggacttaag 
150541 ttcaacgcga ctaggtaaca atgtcaaaga acgtactgct ttaatctcta aagtcatggt 

150601 taatcttgac gatttaccat ttgttcacag tgatatggaa attgatatgt taggtgatgc
150661 atatgaattc ctaattgggc gctttgcggc gacagcaggt aaaaaagcag gcgagttcta 
150721 tacaccacaa caagtatcta agatactggc gaagattgtc acagacggta aagataaatt 
150781 acgtcatgtg tacgacccaa catgtggttc cggttcatta ttgttacgtg ttggtaaaga
150841 aacgcaagtg tatcgttatt tcggtcaaga acgtaacaat accacttaca acttagcacg 

150901 catgaacatg ttattacatg atgtacgtta tgaaaatttc gatatccgta atgatgacac
150961 gttggaaaat ccagcctttt taggacatac atttgatgcg gttattgcga acccaccata 
151021 cagtgcgaaa tggacagcag attcaaaatt tgaaaatgac gaacgattca gcggatacgg 
151081 caaacttgcg ccaaagtcca aagcagactt tgcctttatt caacacatgg tacattactt 
151141 agacgatgaa ggtaccatgg ccgttgtact cccacatggt gtcttattcc gtggtgctgc 

151201 agaaggtgtc attcgtcgtt atttaattga agaaaagaac tacttagaag ccgtgattgg
151261 cttaccagcc aatattttct atgggacaag tattccaaca tgtattttag tatttaaaaa
151321 atgtcgccaa caagaagact atgtattatt tatcgatgca tccaatgatt ttgaaaaagg 
151381 aaaaaatcaa aaccatttaa ccgatgccca agtcgaacgc attattaaca catataagcg 
151441 taaggaaaca attgataaat atagctacag tgcgacatta caagagatcg ccgataacga 

151501 ttacaactta aacataccga gatatgtcga tacattcgaa gaagaagcac cgattgattt
151561 agatcaagtc caacaagatt tgaaaaatat cgacaaagaa atcgcagaag ttgaacaaga 
151621 aatcaatgca tacctgaaag aacttggggt gttgaaagat gagtaataca caaaagaaaa 
151681 atgtgccaga gttgaggttc ccagggtttg aaggcgaatg ggaagagaag aagttagggg 
151741 accttactac caaaataggt agtggaaaga ctcccaaagg tggaagtgaa aactatacaa 

151801 acaaaggcat accattttta aggagtcaaa atattagaaa tggtaaatta aatcttaatg
151861 acttagttta tattagtaaa gatatagatg atgagatgaa aaatagtaga acgtactatg
151921 gtgatgttct tttaaatatt acaggagcat caataggtag aacagccatt aattcgatag 
151981 ttgaaataca tgctaattta aatcaacatg tatgtattat tagattgaaa aaagagtatt 
152041 attataattt ttttggacag tatctattat caagaaaagg taaaaggaaa attttccttg 

152101 cacaaagtgg aggtagtcga gaaggactaa acttcaaaga aattgctaat ttaaaaatct
152161 tcaccccaac tatatttgaa gagcagcaaa aaataggcga attcatcagc aaacttgacc 
152221 gacaaattga attagaagaa caaaaacttg aattacttca gcaacagaaa aaaggctata 
152281 tgcagaaaat cttctcgcaa gaattgcgat tcaaagatga ggaaggtaaa gattatccag 
152341 attggaaatc aaaatcaatt caagaaatat ttgagaataa gggtggcact gctctagaaa 

152401 cagaatttaa ttttgacggt aattataaag ttataagtat aggaagttat tctataaata
152461 gcacttataa tgatcaaaat ataagagtca ataaaaataa aaaaactgaa aaatatattt
152521 tatcaaaagg cgacttagca atggtattaa atgataaaac aaaagatggg aaaattatag 
152581 gtagaagtat atttatagat aaagataatc aatatattta taatcaaaga actgaaagat 
152641 taataccatt tgctgaaaat gataataaat ttttatggtt cttaatgaat acagatttaa
152701 ttagaaataa aataaaaggt atgatgcaag gagcaaccca agtttatata aattattcat
152761 ctattaaatt gatatctata caattgccac ttcttgaaga acaacagaaa ataagagggt
152821 ttctagaagt tttatctgga ataactacta aacaattgca caagatagac caattaaaag 
152881 agaggaaaaa ggcgttttta cagaaaatgt ttatttgatt tgtcgctgca atatagtttt 
152941 tattatttgt ttatttcaga tgtttcacca tcatattgcg taacttttac aaataagaaa
153001 taaagttcaa tgaaatcaaa aacgaacatt aaatttaggc actgtgatag cacagtgtct
153061 tttttgtgtc gaaattgtgt acagaataag tagttaaata aagattaagt tgagataaag 
153121 tgttattcgt aaataaaaga gagtagatcg ataggaattg aatgatatta gttaactatt 
153181 tattaaatta cttaataatg attaattttt agttaaagta agtttaatgt gaagcacgac 
153241 cattgctcat tataatgaat gaggattgtt cgtattgcgt aatagaataa atcaaataga 

153301 ctaaaaattg ggagcataga attatgaaat taaaaaatat tgctaaagca agtttagcac
153361 tagggatttt aacaacaggg atgattacaa ctactgctca gccagtaaaa gcaattgagc 
153421 aaagcagatt atcagttact tcaaaagata cacaagaatt aaaaaaatac tacagtggaa 
153481 caggatataa ttttcaaaat gtgagtggtt atagagaagg taataaaatg aacattattg 
153541 atggaccaca acttaatgta gttactttac ttggcacaga caaagaaagg tttaaggacg 

153601 atgaagatta tgaaggactt gatgtatttg ttgtaagaga agggtcaggt aaacacgcag
153661 ataatatatc aattggtgga attacaaaaa caaataagaa tcaatataaa gaccctgtac 
153721 aaaacgttaa tttattgact tctaagagta acggtcaaaa tactgcttct gtgacttcag 
153781 aatactatag catcaataaa gaagaaattt cattaaaaga acttgatttc aaactaagaa
153841 agcaattaat tgataaacat gatctttata agacagagcc taaagacagc aaaattaaag
153901 tttctatgaa aaatggcggc tactatacgt ttgaattaaa taaaaaatta cagcctcatc
153961 gcatgggtga tacgattgat agtagaaata taaagaaaat tgaagtgaat ttataataat

154021 attcgaggga gtatatcatg agagaaaatt ttaaattacg taaaatgaaa gtcggtttag
154081 tatcagttgc aattactatg ttatatatca tgacaaatgg cgaagcagaa gcatctgaag
154141 gtagccaaac tgttaagaac ccaaaggtga atgcaactga agaaatcaaa gttggatcac 

154201 aaccagttca aaataatcaa gaagtaagtt cggaacaaac aaaaaagaat tttgttaatt
154261 tagaccccat taaacctggt gctcaaaagg taacagggac tactttaccc agccatatta
154321 ttctaatgaa tatagatggt aaaagtgctg attcagtaga tggaggaaat agtgatttag
154381 tatttgctga tgaaaacgga agattcgagt atccactaca caatagaaaa attgttcata 
154441 atcaagaaat tgaggtttcg tcatccagtc ctgatttagg tgatgatgaa gaagatgaag 

154501 aagtagaaga agattcaact gaaaaagctg gtactgagga agaaaacaca gatgctaaag
154561 ctacatacac aacaccacga tatgaaaaag cgtatgaaat accgaaagaa caactaaaag
154621 aaaaagatgg acatcaccaa gttttcatcg aacctattac tgaaggttca ggtattatta
154681 aaggtcatac gtctgtaaaa ggtaaagttg ctctatctat taataataaa tttattaatt
154741 ttgaagaaag agctaaagat ggaattagta aagaagatac taaagctagt tccgatggtg 

154801 tttggatgcc tattaatgaa aaaggatatt ttgattttga tttcaaaaaa aatccttttg
154861 ataacttaga gttaaagaaa aatgatgaaa tctcattaac atttgcacct gatgatgaag
154921 atgaagcatt gaagtcatta attttcaaaa ctaaagtaac gagtttagaa gatattgata
154981 aagcagaaac taaatatgac catactaaag tggaaaaagt aaaagtattg aaagatgtta
155041 aagaagattt acatgtagat gaaatttatg gaagtttgta ccatacagaa caaggtaaag 

155101 gtattctcga taaacaggga actaaagaaa ttacaggtaa gactaaattc gcgaatgcag
155161 tagtgaaagt atattctgac ttaggtgatg cgcaactgtt tcctgatatt caagtagatg 
155221 aaaatggtaa atttagcttt gatgctgaaa aagctggttt cagattacaa aatggagaaa 
155281 cattgaattt tgcagtagtt aaacctatta ctggtgagct attacatcaa ggattcgttt 
155341 ctaagtatat cgatgtttat gaatctccgg aagaaaagaa agaacgtgaa tttgaagaga 

155401 aacttgaaaa cacgcctgca tatcataaat tacatggtga taaaattgtc ggctatgatg
155461 ttcaaggtaa tccatcaact tggttctatc cattaggtga aaagaaagtt gaacgtaagg
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